Commit Graph

29 Commits

Author SHA1 Message Date
Artur Pata 2fbdd10bd4 Refactor to have CE and EE script caches, fix CE cache 2025-12-17 20:15:39 +02:00
RobertJoonas 7540511deb
Clean up detection sentry events + tests (#5833)
* add module name to service_error when check times out

Otherwise it can remain unclear in the diagnostics whether it was
InstallationV2 or InstallationV2CacheBust that timed out.

* Remove duplicate timeout logic

The current production logs show two types of verification timeouts:

* service_error: "Unhandled Browserless response status: 408" (vast
  majority of cases)
* service_error: :timeout (only a few cases)

The latter happens when we hit the Req receive_timeout
(endpoint_timeout + 2s). I've seen Browserless not respect the timeout
param from time to time, so it's better to keep the timeout logic
"in-house" only.

* make service_error into a map with code and extra

* interpret temporary service errors

...but still consider them "unhandled" for telemetry, also notifying Sentry
and logging the warning.

* separate sentry messages (verification)

* make Verification.ChecksTest more DRY

* organize tests into describe blocks

* test verification telemetry and logging

* fix codespell

* get rid of legacy verification

* rename Checks.InstallationV2 -> Checks.VerifyInstallation

* delete Live.Installation and rename Live.InstallationV2 -> Live.Installation

* rename installationv2 (live) files as well

* delete old change-domain routes

Also rename current liveview modules and routes, removing the v2 suffix

* rename domain_change_v2 files, removing v2 suffix

* remove legacy JS verifier code

Also fix dockerignore and elixir.yml referencing a wrong priv path

* rename verification_v2_test -> verification_test

* remove v2 prefix from logs and sentry messages

* clean up duplicate external_sites_controller_test.exs tests

* remove flag

* fix typespec

* pass timeout as query param to Browserless too

* Fixup external sites controller test module (#5826)

* fix test description

* clean up detection sentry events + tests

* improve naming

---------

Co-authored-by: Artur Pata <artur.pata@gmail.com>
2025-10-27 10:31:24 +00:00
RobertJoonas a83b4f3583
Clean up legacy verification code and script v2 flag (#5824)
* add module name to service_error when check times out

Otherwise it can remain unclear in the diagnostics whether it was
InstallationV2 or InstallationV2CacheBust that timed out.

* Remove duplicate timeout logic

The current production logs show two types of verification timeouts:

* service_error: "Unhandled Browserless response status: 408" (vast
  majority of cases)
* service_error: :timeout (only a few cases)

The latter happens when we hit the Req receive_timeout
(endpoint_timeout + 2s). I've seen Browserless not respect the timeout
param from time to time, so it's better to keep the timeout logic
"in-house" only.

* make service_error into a map with code and extra

* interpret temporary service errors

...but still consider them "unhandled" for telemetry, also notifying Sentry
and logging the warning.

* separate sentry messages (verification)

* make Verification.ChecksTest more DRY

* organize tests into describe blocks

* test verification telemetry and logging

* fix codespell

* get rid of legacy verification

* rename Checks.InstallationV2 -> Checks.VerifyInstallation

* delete Live.Installation and rename Live.InstallationV2 -> Live.Installation

* rename installationv2 (live) files as well

* delete old change-domain routes

Also rename current liveview modules and routes, removing the v2 suffix

* rename domain_change_v2 files, removing v2 suffix

* remove legacy JS verifier code

Also fix dockerignore and elixir.yml referencing a wrong priv path

* rename verification_v2_test -> verification_test

* remove v2 prefix from logs and sentry messages

* clean up duplicate external_sites_controller_test.exs tests

* remove flag

* fix typespec

* pass timeout as query param to Browserless too

* Fixup external sites controller test module (#5826)

* fix test description

---------

Co-authored-by: Artur Pata <artur.pata@gmail.com>
2025-10-27 09:39:41 +00:00
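
The "in-house" timeout idea from the commit above can be illustrated with a small sketch. This is not the project's code (which is Elixir and not shown in this log); it is a Python illustration that assumes a hypothetical `run_check` helper, a hypothetical check name, and a hypothetical error shape with `code` and `extra` keys modelled on the "map with code and extra" bullet. The caller enforces its own deadline instead of trusting the remote service to honour a timeout parameter.

```python
import concurrent.futures
import time
from typing import Any, Callable, Dict


def run_check(check_name: str, check: Callable[[], Any], timeout_s: float) -> Dict[str, Any]:
    """Run `check` under a caller-side deadline (hypothetical helper)."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(check)
    try:
        return {"ok": future.result(timeout=timeout_s)}
    except concurrent.futures.TimeoutError:
        # Record which check timed out so diagnostics stay unambiguous.
        return {"service_error": {"code": "timeout", "extra": check_name}}
    finally:
        # Do not block on the abandoned worker; it finishes in the background.
        executor.shutdown(wait=False)


def slow_check() -> str:
    """Stand-in for a remote call that ignores its timeout parameter."""
    time.sleep(0.5)
    return "late"
```

Calling `run_check("ExampleCheck", slow_check, timeout_s=0.05)` returns the `service_error` map rather than waiting for the slow call, which is the behaviour the commit message argues for.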
Adrian Gruntkowski f3ccfde980
Adjust persistor metrics buckets and remove decode measurement (#5831) 2025-10-27 08:12:09 +00:00
Adrian Gruntkowski 0abad8b0ab
Add metric for decode duration in remote persistor client (#5821) 2025-10-23 11:52:27 +00:00
Adrian Gruntkowski 3d1f1eca8e
Ensure `conn` from `Plug.Conn.read_body` is always passed down the pipeline (#5814)
* Ensure `conn` from `Plug.Conn.read_body` is always passed down the pipeline

* Alter persistor related histogram metrics for better view of timings

* Update typespec
2025-10-20 11:54:52 +00:00
Adrian Gruntkowski daf1c1a9cd
Measure total `Persistor.Remote` request duration, outside `Finch` (#5811) 2025-10-16 13:03:23 +00:00
Adrian Gruntkowski e097afea8f
Implement extended metrics for persistor client (#5791)
* Implement conversion of finch telemetry events to persistor specific ones

* Implement metrics and remove unused telemetry

* Adjust buckets

* Adjust buckets again and use milliseconds for unit uniformly
2025-10-09 15:12:31 +00:00
RobertJoonas 3de4aa54d4
Installation support telemetry (#5790)
* detection handled/unhandled telemetry

* telemetry for verification too

* move sentry call next to telemetry event

* fix ce compile warning

* fix case clause

* remove implicit nil

* telemetry_event functions without argument
2025-10-09 14:13:58 +00:00
Adrian Gruntkowski f52bf3feab
Configure promex metrics sent by `Persistor.EmbeddedWithRelay` (#5762) 2025-10-01 07:26:30 +00:00
RobertJoonas 9ed79a831e
Move InstallationSupport code to `/extra` (EE only) folder (#5758)
* move lib/plausible/installation_support/ -> /extra/lib/...

* extract Live.AwaitingPageviews exclusively for CE

* VerificationTest to ee only

* fix the rest of the compile/test errors on CE

* fix warning about not using default for optional argument

* move module attr
2025-09-29 14:01:56 +00:00
Karl-Aksel Puulmann cf423dbf99
ScriptV2: TrackerScriptCache on ee (#5648)
* Leverage TrackerScriptCache on ee

On ee, TrackerScriptCache only stores valid ids. This is then leveraged
to do no database queries when looking up tracker scripts for
non-existing ids.

For smoother onboarding purposes, refresh frequency for the script is also
reduced.

Note that the cache layout is not optimal (storing 'true' booleans) but
being more optimal would require changing the underlying cache
implementation significantly.

I tested out the cache - with 1M tracker script configs, it seems to be
~12MB in size.

* Wait on cache

* Add telemetry

* Remove cleverness in trying to reuse code
2025-08-19 11:41:19 +00:00
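
The cache behaviour described in the commit above (only valid ids are stored, so lookups for non-existing ids need no database query) can be sketched as follows. This is a Python illustration under stated assumptions, not the project's Elixir implementation: `ValidIdCache`, `fetch_script`, and the `query_db` callback are hypothetical names introduced here.

```python
from typing import Callable, Dict, Iterable, Optional


class ValidIdCache:
    """Hypothetical stand-in for a cache that holds only valid script ids."""

    def __init__(self, load_valid_ids: Callable[[], Iterable[str]]) -> None:
        self._load_valid_ids = load_valid_ids
        self._entries: Dict[str, bool] = {}

    def refresh(self) -> None:
        # Rebuild the table; every valid id maps to a plain True, mirroring
        # the "storing 'true' booleans" layout mentioned in the commit note.
        self._entries = {script_id: True for script_id in self._load_valid_ids()}

    def is_known(self, script_id: str) -> bool:
        return self._entries.get(script_id, False)


def fetch_script(
    cache: ValidIdCache,
    script_id: str,
    query_db: Callable[[str], str],
) -> Optional[str]:
    # Unknown ids are rejected from the cache alone, without touching the database.
    if not cache.is_known(script_id):
        return None
    return query_db(script_id)
```

A plain set would express the same membership test more compactly; the boolean values are kept here only to match the layout the commit message describes as "not optimal".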
RobertJoonas 97dcc3fe7c
Refactor Verification module structure (#5570)
* detector.js

* refactor: organize modules better

* Renaming (Elixir + JS)

* lib/plausible/verification -> lib/plausible/installation_support
* test/plausible/verification -> test/plausible/installation_support
* priv/tracker/verifier -> priv/tracker/installation_support
* tracker/verifier -> tracker/installation_support
* tracker/test/verifier -> tracker/test/installation-support

* rename remaining test modules

* add documentation

* dialyzer: remove module refs that do not exist yet

* Fix CI

* fix tracker CI

* fix tracker CI for good
2025-07-15 10:50:34 +00:00
RobertJoonas b76996b3a4
Verification v2 (#5549)
* new verifier script with tests + telemetry

* dataDomainMismatch tests

* more tests for callbackStatus and plausibleInstalled

* create priv/verifier subfolder + fix Elixir CI

* bump CI cache version

* organize verifier tests

* Remove accidentally committed verifier

* Rework compilation: Make it a variant, always return new verifier code in tests

* Make priv/tracker/verifier/ exist

* Handle static checks with grace

* Fix paths

* Fix paths

* Add some tests

* Add one more test

* split up the JS

* proxyLikely + code structure refactor + unit tests

* fix telemetry fields

* move most telemetry to logs

* run verifier tests only on chromium

* detect wordpressPlugin and wordpressLikely

* detect GTM

* rename JS checks

* detect cookiebot

* include new fields in logs

* different logs for browserless request vs js failures

* detect manual extension

* detect unknown attrs + fix logging

* stick to Elixir checks for snippet detection

* fix codespell

* fix IO.inspect

* remove unnecessary fields from test mock

* cookiebot doc

* move test into verifier subfolder

* do not duplicate ts types

* comma -> semicolon in log

* test dynamically loaded snippet

* improve logging on Browserless error

---------

Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
2025-07-14 14:32:21 +00:00
Karl-Aksel Puulmann 692fd30a3e
Add telemetry to tracker script generation (#5437)
This will be used to measure rollout success and load after purges
2025-06-03 07:11:47 +00:00
hq1 aa4a8339cb
Ingest throughput fixes (#5378)
* Update

* Update

* Naive safety valve in front of RL

* Revert "Naive safety valve in front of RL"

This reverts commit 3bb553ec2e.

* rate limit with atomics

* update test

* Reapply "Add +Mdai max emulator flag (#5373)" (#5374)

This reverts commit b28ca2ffee.

* Update load script

* Update LOADTEST mode

* Revert "Stop aggregating buffered ingest counters (#5372)"

This reverts commit 2c41dcd4c1.

* update

* Fix cache hit/miss metric tags

---------

Co-authored-by: ruslandoga <ruslandoga+gh@icloud.com>
2025-05-05 14:00:37 +00:00
hq1 ffae16f7b9
Stop Cache.Stats + Revert "Temporarily disable ingest metrics (#5369)" (#5370)
* Revert "Temporarily disable ingest metrics (#5369)"

This reverts commit b96e96a7f6.

* Add :tools to MIX_ENV=dev

* Stop tracking caches hit ratio in favour of raw counters
2025-04-30 08:11:51 +00:00
hq1 4821c9489e
Change session transfer duration unit (#5338)
* Change session transfer duration unit

* Update buckets

* Change metric name to start clean
2025-04-17 05:10:33 +00:00
ruslandoga 9bfb1992d9
Sessions transfer (#5229)
* sessions transfer

* took on ignore

* add slow test

* more tests

* allow BIG messages

* update tests

* continue

* continue

* cleanup

* fewer changes

* add config tests

* todo

* use less app env

* oops

* fixes

* cleanup

* update tests

* remove useless tests

* more buckets

* Update lib/plausible/session/transfer.ex

* Update transfer.ex

* Update lib/plausible/session/transfer.ex

* avoid calling into ConCache directly

* cleanup, add docs

* Update lib/plausible/session/transfer.ex

* fewer options

* there is no loop

* force deploy

* force deploy again

* individual puts

* fewer changes in cache/adapter.ex

* oops

* use existing atom names

* Cosmetic changes

* Bring back slow tag

---------

Co-authored-by: hq1 <hq@mtod.org>
2025-04-16 09:56:39 +00:00
hq1 b45d5e90c1
Supervise user agent parsing (#5243)
* Supervise user agent parsing

* Instrument UA parse timeout with a counter

* Test
2025-03-27 10:47:19 +00:00
Adrian Gruntkowski 8a1c6e0913
Enforce sequential processing of session events (#4493)
* Create a regression demonstration test for race condition

* Use `ConCache.isolated/1` to force sequential processing of session events

* Revise comment in regression test

* Put lock call behind cache adapter API

* Add more explicit handling of failing lock

NOTE: Apparent double execution of lock function needs to be investigated.

* Improve slow lock cases tests

* Reduce number of session cache locks and instrument them w/ telemetry

* Format

---------

Co-authored-by: Adam Rutkowski <hq@mtod.org>
2024-09-03 09:29:32 +02:00
hq1 7523abe93e
Add metrics to ingestion pipeline (#3927)
* Add metrics to ingestion pipeline

* Format

* Format

* Update buckets

* Credo
2024-03-26 09:42:48 +01:00
hq1 59afa20955
Reapply #3878 + bugfix hit rate tracking (#3891)
* Reapply "Replace caching engine (#3878)" (#3883)

This reverts commit c5881cdc6d.

* Ensure hit rate is tracked on `get_or_store`

* Remove :wx and :observer

* Remove unused deps

* Use `:set` table type
2024-03-14 08:06:12 +01:00
hq1 c5881cdc6d
Revert "Replace caching engine (#3878)" (#3883)
This reverts commit 437a3350ff.
2024-03-12 08:30:16 +01:00
hq1 437a3350ff
Replace caching engine (#3878)
* Dependencies: swap Cachex for ConCache

* Implement Cache adapter wrapping ConCache

* Implement cache stats tracker, for metrics

* Use Cache.Adapter in Plausible.Cache

Marking the test as not slow anymore

* Use Cache Adapter when tracking sessions

* Use Cache Adapter for UA parsing

* Rename child identifiers - cachex is obsolete now

* Test stats tracking

* Update grafana metrics

* Put all caches under common child specification

* Try less

* Shorten the function delegation path
2024-03-12 07:58:12 +01:00
Adam Rutkowski 356575ef78
Gatekeep ingestion pipeline (#2472)
* Update Sites.Cache

So it's now capable of refreshing most recent sites.
Refreshing a single site is no longer wanted.

* Introduce Warmer.RecentlyUpdated

This is a Sites Cache warmer that runs only for
the most recently updated sites, every 30s.

* Validate Request creation early

* Rename RateLimiter to GateKeeper and introduce detailed policies

* Update events API tests - a provisioned site is now required

* Update events ingestion tests

* Make limits visible in CRM Sites index

* Hard-deprecate DOMAIN_BLACKLIST

* Remove unnecessary clause

* Fix typo

* Explicitly delegate Warmer.All

* GateKeeper.allowance => GateKeeper.check

* Instrument Sites.Cache measurements

* Update send_pageview task to output response headers

* Instrument ingestion pipeline

* Credo

* Make event telemetry test a sync case

* Simplify Request.uri/hostname handling

* Use embedded schema, apply action and rely on get_field
2022-11-28 15:50:55 +01:00
Adam Rutkowski 9364cebb4b
Fix up Cachex metrics (#2418)
* Fix cachex size metrics

* Collect Cachex hit ratio for UAs/Sessions

* Revert profiling metrics from 101e5a68
2022-11-03 12:28:02 +01:00
Adam Rutkowski 101e5a68b5
Allow Site DB lookups during ingestion phase (#2408)
* Implement FF-driven DB lookup for sites during ingestion

We'd like to see the impact of doing a simple postgres lookup on each
ingestion event. The percentage-based feature flag `:ingestion_pg_lookup`
must be set in order for lookups to be executed.

* Fix resolving Cachex stats metrics

* Enable PromEx on dev env
2022-11-01 17:11:50 +02:00
Manu S Ajith c0c36646e2
Add Custom telemetry for Plausible.Event.WriteBuffer, Plausible.Event.WriteBuffer and Cachex (#2095)
* Add Custom telemetry for Plausible.Event.WriteBuffer, Plausible.Event.WriteBuffer and Cachex

Signed-off-by: Manu S Ajith <neo@codingarena.in>

* Rename telemetry.ex to avoid confusion with Phoenix Telemetry supervisor

Signed-off-by: Manu S Ajith <neo@codingarena.in>

* Remove duplicate event

Signed-off-by: Manu S Ajith <neo@codingarena.in>

Signed-off-by: Manu S Ajith <neo@codingarena.in>
2022-08-12 09:50:18 +03:00