Commit Graph

4 Commits

Author SHA1 Message Date
Karl-Aksel Puulmann d620432227
Channels: Speed up clickhouse calculations (#4789)
* Fix interpolation in data_migration.ex

* Speed up calculating acquisition_channel in clickhouse

The previous `has` queries proved to be problematic and causing a lot of
CPU overhead.

Benchmarked via this query:

```sql
SELECT
  channel,
  count(),
  countIf(acquisition_channel(referrer_source, utm_medium, utm_campaign, utm_source, click_id_param) = channel) AS matches
FROM events_v2
WHERE timestamp > now() - toIntervalHour(48)
GROUP BY channel
ORDER BY count() desc
```

Before this fix:
```
query_duration_ms:                                                57960
DiskReadElapsedMs:                                                374.712
RealTimeMs:                                                       2891200.667
UserTimeMs:                                                       2704024.783
SystemTimeMs:                                                     1693.265
OSCPUWaitMs:                                                      90.253
OSCPUVirtualTimeMs:                                               2705709.58
```

After this fix:
```
query_duration_ms:                                                4367
DiskReadElapsedMs:                                                454.356
RealTimeMs:                                                       213892.207
UserTimeMs:                                                       199363.485
SystemTimeMs:                                                     1479.364
OSCPUWaitMs:                                                      13.739
OSCPUVirtualTimeMs:                                               200837.37
```

Note that the new tables are not tracked in our schema as usual as
they're pretty much temporary tables to create the dictionary without
needing to upload files to clickhouse servers.

* CREATE OR REPLACE table with SELECT
2024-11-11 10:39:51 +00:00
Karl-Aksel Puulmann ee3d1e770e
APIv2: visit:country_name, visit:region_name, visit:city_name dimensions (#4328)
* Add data migration for creating and syncing location_data table and dictionary

* Migration to populate location data

* Daily cron to refresh location dataset if changed

* Add support for visit:country_name, visit:region_name and visit:city_name dimensions

Under the hood this relies on a `location_data` table in clickhouse being regularly synced with
plausible/location repo and dictionary lookups used in ALIAS columns

* Update queue name

* Update documentation

* Explicit structs

* Improve docs further

* Migration comment

* Add queues

* Add error when already loaded

* Test for filtering by new dimensions

* Update deps

* dimension -> select_dimension

* Update a test
2024-08-13 09:44:58 +03:00
Adam 6d79ca5093
Switch to new clickhouse adapter (ch/chto) (#2733)
* another clickhouse adapter

* don't restore stats_removal.ex

* fix events main-graph error (#2746)

* update ch, chto

* update chto again (#2759)

* Stop treating page filter as an entry_page filter (#2752)

* remove dead code

* stop treating page filter as entry page filter in breakdown queries

* stop treating page filter as entry page filter in aggregate queries

* stop treating page filter as entry page filter in timeseries queries

* mix format

* update changelog

* break code down to smaller functions to keep credo happy

* remove unused functions

* make CSV export return only conversions with goal filter (#2760)

* make CSV export return only conversions with goal filter

* update changelog

* update elixir version in mix.exs (#2742)

* revert admin.ex changes (#2776)

---------

Co-authored-by: ruslandoga <67764432+ruslandoga@users.noreply.github.com>
Co-authored-by: ruslandoga <rusl@n-do.ga>
Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
2023-03-21 09:55:59 +01:00
Uku Taht 4b36bb7138
Use clickhouse_ecto for db connection (#317)
* Use clickhouse-ecto for stats

* Use clickhouse ecto instead of low-level clickhousex

* Remove defaults from event schema

* Remove all references to Clickhousex

* Document configuration change

* Ensure createdb and migrations can be run in a release

* Remove config added for debug

* Update plausible_variables.sample.env
2020-09-17 16:36:01 +03:00