* Migration: add custom props to goals + revisit unique constraints
* Update constraints in goal schema (and move module)
* Add a comment, not really related but useful?
* Implement querying for goals with custom props
* Optimize goal_join_data (down to one iteration) + include goal custom props
* Test goal custom props addition + new constraints
* Test querying for goals with custom props attached
* Test funnels made of goals with custom props
* Format
* Fixup test name
* Fixup migration
* Unified goal join macro
* Remove dupe test
* Clean up user_id usage
* Fixup test to match the description
* Revert "Temporary: make room for pre/post migration constraint names (#5942)"
This reverts commit e4bc6b8715.
---------
Co-authored-by: Uku Taht <uku.taht@gmail.com>
* Include revenue data for all detailed API responses except entry/exit pages
* Expose revenue data in all breakdown modals except entry/exit pages
* Add revenue metrics to breakdown response only on EE
* Change query builder to enable querying event metrics with session dimension
* Add revenue metrics to entry and exit pages breakdowns
* Expose revenue data in entry and exit pages breakdowns
* Use `argMax` for `exit_page` and `exit_page_hostname` dimensions (h/t @ukutaht)
* Don't handle event-only dimensions with session-only metrics for now
* Add tests for all breakdowns
* Add clarifying comments in code
* Mark revenue tests as EE-only
* Refactor table_decider#partition_metrics
* Refactor query pipeline to return a list of subqueries after splitting
* Move order_by out of join logic
* Refactor joining logic in query_builder
1. JOIN type is now set in QueryOptimizer
2. JOIN logic is now table and list-size agnostic
* Comment an edge case
* Rebuild session/visit smearing
Previously, when graphing any visit metric hourly or in realtime, visit_duration and other
visit metrics came out far higher than expected, because long sessions
inflated every bucket they spanned. Now the visits/visitors metrics are still
smeared, while other visit metrics are counted under the last bucket the user was
active in.
The visits metric was also overcounted (see new tests).
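A rough illustration of the bucketing rule described above, written in Python rather than the project's own query code. It assumes "smeared" means a session is counted in every hourly bucket it overlaps; `hour_bucket` and `bucket_sessions` are hypothetical names used only for this sketch.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def hour_bucket(ts):
    """Truncate a datetime to the start of its hour."""
    return ts.replace(minute=0, second=0, microsecond=0)


def bucket_sessions(sessions):
    """sessions: iterable of (start, last_activity) datetime pairs.

    Returns (visits per bucket, average visit duration per bucket).
    Visits are smeared across every hour the session overlaps (assumption),
    while the duration is attributed only to the last active bucket.
    """
    visits = defaultdict(int)
    durations = defaultdict(list)
    for start, end in sessions:
        bucket = hour_bucket(start)
        last = hour_bucket(end)
        while bucket <= last:
            visits[bucket] += 1
            bucket += timedelta(hours=1)
        durations[last].append((end - start).total_seconds())
    avg_duration = {b: sum(v) / len(v) for b, v in durations.items()}
    return dict(visits), avg_duration
```

With one long session and one short one, only the final bucket carries a duration, so earlier buckets are no longer inflated by the long session.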
* Remove unneeded case
* Unit test for smearing in tabledecider
* Support passing `include` as a query parameter for dashboard APIs
* Mark time-on-page metric sortable
It now is thanks to the changed query
* new-time-on-page flag with cutoff being sent to the frontend
* Add correct tooltip title
* Implement metric warning for when legacy and new time_on_page metrics are mixed
* Send legacy_time_on_page_cutoff to backend
* Make time-on-page graphable with the new metric
* Only show metric warnings for time_on_page if flag is enabled
* Changelog
* Solve a ClickHouse error when querying timeseries with only legacy time-on-page
* Add tests for timeseries of new time-on-page
Along the way fix an issue with comparisons not working properly
* Solve a typing issue
* Allow toggling legacy_time_on_page_cutoff off in dashboard
* Slightly better workaround
* Solve typing issue
* Prettier
* Guard against no warning
* Solve warning
* Default to time_on_page
* Add new columns to schema
* Read from new column in legacy query
* Read/write new imported_pages columns
* Remove time_on_page column from imported_pages
* Simple, stupid new_time_on_page metric
* Update csv_importer schema
* Refactor: consistent __internal helpers, this will help with joining the query
* Refactor select_joined_metrics
* Refactor: pass `query` to event_metric
* Refactor: remove needless site argument from various calls
* Legacy joining query attempt
* Move test around
* Add more tests for both legacy and new time_on_page metrics in query API
* time_on_page reported in seconds
* timeseries test for metric
* WIP
* Wrap main query in subquery - without this we run into trouble performing the join
* Calculate time_on_page in main query, no more new_time_on_page
* Add some TODOs
* Return NULL over 0 when no visits with time-on-page data
* Update moduledoc
* Update some tests that were not expecting integers
* Add a TODO
* Update tests
* Make graphing time series with combined metrics work.
* Slightly more consistent approach to flag updating in APIv2
* Seeds with engagement data
* Make graphing time series when cutoff is in the middle work
Bakes fewer assumptions into everything as well.
* Rename to legacy_time_on_page_cutoff
* Fixup lib/plausible_web/controllers/api/external_query_api_controller.ex
* Remove a todo and dead/misleading code
* Remove a resolved todo
* Remove needless rounding
* gen types
* Update pages test
* Remove needless columns from select
* Update tests: timestamps and remove comment
* Flip branches
* Revert "Disable scroll depth exports temporarily"
This reverts commit 48ad691f53.
* Remove support for pageleave events being equivalent to engagement in ingestion
* Explicit column ordering inside csv imports
Subtle change, but this ensures that CSVs that contain extra columns or differently named columns do not cause trouble
* Add scale_sample fragment helper
* Update scroll depth queries to be based on visits rather than visitors
* Add test demonstrating session-based results
* Update csv test (session vs user-based difference)
* Attempt to update csv tests
* PR feedback
* migration: add scroll_threshold to goals
* update goal schema
* setup simple UI for creating scroll goals
* add ability to filter and breakdown scroll goals
* fix goals form tests
* add validation that page path exists
* move todo comments to expression.ex
* move tests
* make it clear that scroll_threshold is optional
* avoid calling Plausible.Goal.type() too many times
* do not consider 255 scroll depth a conversion
* migration: add scroll_threshold to goals
* do not drop the old index yet
* More efficient goals join again
* Refactor: move goals stats code explicitly under Stats.Goals module
* Move code under Plausible.Stats.Goals
* 254 -> 100
* add scroll_threshold field to goal schema + new unique constraint
* adjust test to test what it claims to
* mix format
* add migration
* consider imported query unsupported when filtering by page scroll goal
* add missing tests
* pattern match imported argument
* silence credo
* Update lib/plausible/stats/sql/expression.ex
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* use site_imports populated in test setup
---------
Co-authored-by: Karl-Aksel Puulmann <oxymaccy@gmail.com>
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* migration: add scroll_depth to events_v2
* (cherry-pick) ingest scroll depth
* replace convoluted test with more concise ones
* QueryParser: parse internal scroll_depth metric + validation
* turn QueryComparisonsTest into QueryInternalTest
* rename file
* (cherry pick) query scroll depth 15b14d3
...and move the tests into `internal_query_test.exs`
* review feedback
* Get rid of unnecessary separation between aggregate and group scroll depth
* Drop irrelevant other metrics in tests
* add test ensuring scroll depth unavailable in Stats API v1
* Put scroll depth on the dashboard
* Top Stats
* Main Graph
* Top Pages > Details
* feature flag for dashboard scroll depth access
* ignore credo warning
* enable scroll_depth flag in tests
* remove duplication
* write timestamps explicitly in a test
* revert moving tests around
* Add query_comparisons_test back
* Move scroll_depth tests into query_test
* Delete query_internal_test
* rename setup util (got updated on master)
* use pageleave_factory where applicable
* Use the correct generated query-api.d.ts
* npm format
* add experimental pageleave script variant
* also send pageleave events on SPA navigation
* disallow goals with 'pageleave' event name
* do not count pageleaves towards the event metric
* remove duplication in test file
* do not update sessions on pageleave events
* ignore pageleaves in the current time_on_page implementation
* make pageleave events not billable
* rename function
* Prevent multiple pageleaves being sent at the same time
* query.date_range is now in UTC instead of user timezone
This simplifies things down the line and fixes several bugs where
query.date_range is cast to NaiveDateTime for Ecto purposes.
Many places still remain broken:
- comparison queries
- `to_date_range` calls
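To make the change concrete, here is a minimal Python sketch of turning a date range in the user's time zone into a UTC time range. It is not the project's code: `to_utc_time_range` is a hypothetical helper, and the half-open `[start, end)` convention is an assumption made for the example.

```python
from datetime import date, datetime, time, timedelta, timezone


def to_utc_time_range(first, last, tz):
    """Convert an inclusive local date range into a half-open UTC range.

    first, last: datetime.date values in the user's time zone.
    tz: a tzinfo object for that time zone (e.g. zoneinfo.ZoneInfo(...)).
    """
    start_local = datetime.combine(first, time.min, tzinfo=tz)
    # Midnight after the last day, so the range covers the whole final date.
    end_local = datetime.combine(last + timedelta(days=1), time.min, tzinfo=tz)
    return start_local.astimezone(timezone.utc), end_local.astimezone(timezone.utc)
```

Doing the conversion once, up front, means later code can compare timestamps in UTC without re-applying the site time zone.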
* Make default_for_date_range not care about time zones
* Make timezone parameter mandatory for to_date_range
* Simplify utc_date_range, update legacy query builder
* Fix more cases where query date range is needed
* query.date_range -> query.utc_time_range
* Query.date_range/1 function
* ensure_include_imported update
* Clean up send_email_report
* Safeguard session queries relying on `sign` from faulty old session entries
* Comment updated metric
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Apply safeguards to `bounce_rate` metric only
* Add note to bounce rate definition in SQL fragments as well
* Add test for graceful bounce rate handling in breakdown
* Make user_id more unique
* Add a note to the test
* Move regression test to APIv2 tests
---------
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* add realtime date_ranges into the private API schema
This commit starts parsing date ranges into a new NaiveDateTimeRange
struct, rather than a simple Date.Range.
* transform realtime labels into negative integers + test
* move schema type argument to last position in helper functions
* allow passing a date param + tests
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* keep test file structure consistent
* Turn NaiveDateTimeRange into DateTimeRange
* change 'now' field from NaiveDateTime to DateTime in v2 query
* fix minute interval labels + add missing tests
* return query_result.date_range as iso8601 timestamps with timezone
* allow timestamps with tz as date_range arguments in API v2
* delete Plausible.Timezones.to_utc_datetime
* simplify returning comparison periods
* add comment about realtime not supported in comparisons
* pass only now instead of test_opts
* drop redundant else branch
* separate tests
* stick to a single check_date_range function in tests
* fix credo error
---------
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Add data migration for creating and syncing location_data table and dictionary
* Migration to populate location data
* Daily cron to refresh location dataset if changed
* Add support for visit:country_name, visit:region_name and visit:city_name dimensions
Under the hood this relies on a `location_data` table in ClickHouse being regularly synced with
the plausible/location repo, and on dictionary lookups used in ALIAS columns
* Update queue name
* Update documentation
* Explicit structs
* Improve docs further
* Migration comment
* Add queues
* Add error when already loaded
* Test for filtering by new dimensions
* Update deps
* dimension -> select_dimension
* Update a test
* Refactor Expression.dimension to accept q
* Handle quarter- and half-hour timezones
Previously APIv2 output didn't start at a full hour for these time zones
and the main graph was blank.
The core reason is that ClickHouse `timeSlots` is not time-zone
aware and works off of the Unix epoch - meaning that in time zones which
have an offset of 5:45 the "hours" reported would start at minute :45.
The fix is kind of silly - we now divide each hour into 4 and handle
things that way.
Related basecamp issue: https://3.basecamp.com/5308029/buckets/36789884/card_tables/cards/7590936581
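The effect can be reproduced outside ClickHouse. The Python sketch below is only an illustration of the arithmetic, not the actual query change: it floors a timestamp to epoch-based hours, then to quarter-hours followed by truncation to the local hour. Both function names are hypothetical, and the second function is one possible reading of "divide each hour into 4".

```python
from datetime import datetime, timedelta, timezone

# Fixed +05:45 offset, used here as an example of a quarter-hour time zone.
KATHMANDU = timezone(timedelta(hours=5, minutes=45))


def epoch_hour_start(ts):
    """Floor to a whole hour counted from the Unix epoch (time-zone unaware)."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % 3600, tz=ts.tzinfo)


def local_hour_via_quarters(ts):
    """Floor to a 15-minute slot first, then truncate to the local hour."""
    epoch = int(ts.timestamp())
    quarter = datetime.fromtimestamp(epoch - epoch % 900, tz=ts.tzinfo)
    return quarter.replace(minute=0)
```

Quarter-hour slots line up with local clock time for every offset that is a multiple of 15 minutes, which is why the finer slot size avoids the :45 start.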
* Fix test typo
* Move fragments module under Plausible.Stats.SQL
* Introduce select_merge_as macro
This simplifies some select_merge calls
* Simplify select_join_fields
* Remove a needless dynamic
* wrap_select_columns macro
* Move metrics from base.ex to expression.ex
* Move WhereBuilder under Plausible.Stats.SQL
* Moduledoc
* Improved macros
* Wrap more code
* select_merge_as more
* Move defp to the end
* include.time_labels parsing
* include.time_labels in result
Note that the previous implementation of the labels from TimeSeries.ex was broken
* Apply consistent function in imports and timeseries.ex
* Remove boilerplate
* WIP: Limited support for timeseries-with-querybuilder
* time:week dimension
* cleanup: property -> dimension
* Make querying with time series work
* Refactor: Move special metrics (percentage, conversion rate) to own module
* Explicitly format datetimes
* Consistent include_imported in special metrics
* Solve week-related crash
* conversion_rate hacking
* Keep include_imported consistent after splitting the query
* Simplify do_decide_tables
* Handle time dimensions in imports cleaner
* Allow time dimensions in custom property queries
* time:week handling continued
* cast_revenue_metrics_to_money
* fix `full_intervals` support
* Handle minute/realtime graphs
* experimental_session_count? with timeseries
This becomes required as we try to include visits from sessions by default
* Support hourly data in imports
* Update bounce_rate in more csv tests
* Update some time-series query tests
* Fix for meta.warning being included incorrectly
* Simplify imported.ex
* experimental_session_count flag removal
* moduledoc
* Split interval and time modules
* Move fragments module under Plausible.Stats.SQL
* Introduce select_merge_as macro
This simplifies some select_merge calls
* Simplify select_join_fields
* Remove a needless dynamic
* wrap_select_columns macro
* Move metrics from base.ex to expression.ex
* Move WhereBuilder under Plausible.Stats.SQL
* Moduledoc
* Improved macros
* Wrap more code
* select_merge_as more
* Move defp to the end
* wrap_alias
* Revert "Revert "APIv2: Replace breakdown module with QueryBuilder (#4283)" (#4292)"
This reverts commit ef5e0e0382.
* Allow querying events and pageviews from sessions table
This is not strictly accurate, especially with shorter time frames, but
is useful for a fallback mechanism. I'll figure out something around
shorter time frames in the future.
See also: https://github.com/plausible/analytics/pull/4292
* Only query events and pageviews in legacy breakdowns
* WIP: Breakdown using QueryBuilder
* Revert "Remove problematic test"
This reverts commit b442bb5d1f.
* Get more breakdown tests passing
* Preload goals, sort when dealing with time_on_page
* Handle conversion_rate in breakdowns
* Simplify ordering by using selected_as consistently for dimensions
* Get breakdown tests passing
* Strings to atoms in keys for StatsController.transform_keys calls to work
* Handle revenue metrics removal
* Add test for nil-removal case
* Include percentage metric
* Fix and test with imported locations
* Fixup time-on-page
* Fix country/region automatic filters
* Handle multiple imports (os/browser version) in importsv2
* Filter goals
* Default to ordering by page as well
* Calculate conversion rate on sessions if needed
* Order by event dimensions - handles event:page special case
* Update tests
* Update more tests, handle goal=0 case in imports
* Handle event:goal breakdowns correctly with filters
* Revenue to money
* Improved table deciding
* Also update event:page filters on event:page breakdown
* bounce_rate to 0
Previous behavior relied on two queries being made - new query leads to 0 naturally
* Update pagination test
* don't count non-pageviews as path goal completions
* Make revenue logic breakdown-specific
It's hard to fit into the new schema and likely needs a rethink for apiv2
* Retain previous behavior for TimeSeries module
* Get GA4 test passing
Most failures are related to ordering, pageviews shouldn't be read off of sessions
* Clean up old methods
* Simplify imported.ex
* Don't crash on garbage filters
* Reflect ordering-related change in test
* Fix test data
* Update table_decider
* Re-simplify get_revenue_tracking_currency
* Revert revenue changes
* Use Query.set
* Remove a TODO
* csv importer: no pageviews
Pageviews were incorrectly fetched from sessions table before, causing issues
* csv importer tweaking
* Remove use Plausible
* to_existing_atom
* Add some aggregates tests
* Port aggregates tests to do with filtering
* Session metrics can be queried with event: filters
* Solve a typo
* Update a validation message
* Add validations for views_per_visit
* Port an aggregation/imports test
* Optimize time dimension, add tests
* Add first timeseries test, update parsing tests
* Docs for SQL.Expression
* Test timeseries more
* Allow time explicitly in order_by
* Add multiple breakdowns test
* Refactor QueryOptimizer not to care about time dimension placement in dimensions array
* Add test breaking down by event:hostname
* Add hostname filtering logic to QueryOptimizer, unblock some tests
* WIP: Breakdown by goal
* conversion rate logic for query api
* Update more tests
* Set default order_by
* dimension_label
* preloaded_goals in tests
* inline load_goals
* Use Date functions over Timex
* Comments
* is_binary
* Remove special form used in tests
* Fix defmodule
* WIP: Fix memory leak, event:page breakdown logic
* Enable more tests, fix for group_conversion_rate without explicit visitors metric
* Re-enable a partially commented test
* Re-enable a partially commented test
* Get last test passing
* No imports order_by in apiv2
* Add a TODO
* Remove redundant Util call
* Update aggregate.ex
* Remove problematic test
* WIP new querying
* WIP: Move some aggregate code under new command
* WIP: Add joins, handling fewer metrics
* join events table to sessions if needed
* Merge imported results with built query
* Remove dead code
* WIP: /api/v2/query
* Allow grouping by time
* Use JOIN for main query
* Build query result
* update parse_time
* Make joinless order by work
* First test
* more breakdown tests
* Serialize event:goal filters in a JSON-encodable way/reflection
* Handle inner vs outer ORDER BY clauses properly
* Handle single conversion_rate metric
* Update more tests
* Get parsing tests passing again
* Validate filtered goal filter is configured
* Enable more validation tests
* Enable more event:name breakdown tests
* Enable more breakdown tests
* Validate site has access to custom props
* Validate conversion_rate metric which is only allowed in some situations
* Validate that empty event:props: is not valid
* handle query.dimensions properly in table_decider
* test more validations on metrics/dimensions
* Validate session metrics in combination with event dimension(s)
* Tests cleanup
* Parse include.imports
* Get imports working with new querying
* Make more imports tests work
* Make event:props:path imports-adjacent test work
* Get query imports warning-related tests running
* Remove dead pagination tests
* Solve dead import
* Solve some warnings
* Update aggregate metrics tests
* credo
* Improve test naming
* Lazy goal loading
* Use datetime methods
* Ecto -> SQL module name
* Remove Expression.dimension mode option