* Replace usages of `Timex.to_unix` with native API
* Wrap call to `Timex.is_valid_timezone?`
* Wrap calls to `Timex.today(tz)`
* Replace `Timex.today()` with `Date.utc_today()`
* Replace `Timex.now()` with `DateTime.utc_now()`
* Replace `Timex.compare` with `Date.compare`
* Wrap `Timex.diff` calls
* Replace `Timex.Timezone.convert` with `DateTime.shift_zone!`
* Wrap `Timex.parse!`
* Replace `Timex.to_date` with native API calls
* Replace `Timex.beginning|end_of...` with native API calls for Date
* Wrap `Timex.beginning|end_of...` for DateTimes and Dates for years
* Replace `Timex.format(!)` with native API calls
* Replace `Timex.to_naive_datetime` with native API calls
* Wrap time humanizing routines using Timex
* Remove unnecessary `use Timex` instances
* Replace `Timex.shift` with native API calls
* Make `QueryParser.parse_date` handle gaps and ambiguities gracefully
* Replace `Timex.now(tz)` with `DateTime.now!(tz)`
* Use a more suitable Date function for comparison (h/t @aerosol)
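Taken together, the Timex removals above mostly map one-to-one onto the Elixir standard library. A minimal sketch of the typical equivalences (illustrative only; `DateTime.now!/1` and `DateTime.shift_zone!/2` assume a timezone database such as tzdata is configured):

```elixir
now_utc = DateTime.utc_now()                               # was Timex.now()
today = Date.utc_today()                                   # was Timex.today()
in_berlin = DateTime.now!("Europe/Berlin")                 # was Timex.now(tz)
shifted = DateTime.shift_zone!(now_utc, "Europe/Berlin")   # was Timex.Timezone.convert(dt, tz)
unix = DateTime.to_unix(now_utc)                           # was Timex.to_unix(dt)
order = Date.compare(~D[2024-01-01], today)                # was Timex.compare(d1, d2)
```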
* query.date_range is now in UTC instead of the user's timezone
This simplifies things down the line and fixes several bugs where
query.date_range was cast to NaiveDateTime for Ecto purposes.
Many places still remain broken:
- comparison queries
- `to_date_range` calls
* Make default_for_date_range not care about time zones
* Make timezone parameter mandatory for to_date_range
* Simplify utc_date_range, update legacy query builder
* Fix more cases where query date range is needed
* query.date_range -> query.utc_time_range
* Query.date_range/1 function
* ensure_include_imported update
* Clean up send_email_report
* add realtime date_ranges into the private API schema
This commit starts parsing date ranges into a new NaiveDateTimeRange
struct, rather than a simple Date.Range.
* transform realtime labels into negative integers + test
* move schema type argument to last position in helper functions
* allow passing a date param + tests
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Update test/plausible/stats/query_parser_test.exs
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* keep test file structure consistent
* Turn NaiveDateTimeRange into DateTimeRange
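Roughly, this replaces the plain `Date.Range` with a pair of UTC timestamps. A hypothetical sketch of the shape (struct fields and the helper are assumptions, not the actual implementation):

```elixir
defmodule DateTimeRange do
  @enforce_keys [:first, :last]
  defstruct [:first, :last]

  # Build a UTC range covering one calendar day in the site's timezone.
  # Assumes a timezone database (e.g. tzdata) is configured.
  def for_day(%Date{} = date, timezone) do
    first = DateTime.new!(date, ~T[00:00:00], timezone) |> DateTime.shift_zone!("Etc/UTC")
    last = DateTime.new!(date, ~T[23:59:59], timezone) |> DateTime.shift_zone!("Etc/UTC")
    %__MODULE__{first: first, last: last}
  end
end
```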
* change 'now' field from NaiveDateTime to DateTime in v2 query
* fix minute interval labels + add missing tests
* return query_result.date_range as iso8601 timestamps with timezone
* allow timestamps with tz as date_range arguments in API v2
* delete Plausible.Timezones.to_utc_datetime
* simplify returning comparison periods
* add comment about realtime not supported in comparisons
* pass only now instead of test_opts
* drop redundant else branch
* separate tests
* stick to a single check_date_range function in tests
* fix credo error
---------
Co-authored-by: Karl-Aksel Puulmann <macobo@users.noreply.github.com>
* Extend the GSC API with search functionality
* Fix typo in error tuple atom
* log GSC errors
* rename confusing variable name
* Fix the API response format and error handling
* Read results under the `results` key to be consistent with other
endpoints
* Return errors with a non-200 status code and an error payload
that can be properly constructed into an ApiError
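The contract above could be sketched roughly like this (module name and status code are assumptions, not the actual controller):

```elixir
defmodule ExampleSearchTermsController do
  use Phoenix.Controller
  import Plug.Conn

  # Successful responses nest data under "results", consistent with other endpoints.
  def render_terms(conn, {:ok, terms}) do
    json(conn, %{results: terms})
  end

  # Failures use a non-200 status and an error payload the frontend builds into an ApiError.
  def render_terms(conn, {:error, message}) do
    conn
    |> put_status(:bad_gateway)
    |> json(%{error: message})
  end
end
```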
* rebuild Google Keywords modal with useAPIClient
* Add pagination support in Search Terms API
* delete unused fixture file
* rename fixture files
* fix tests
* New struct format for query after parsing
* WIP refactoring
* WIP: Validations working
* WIP: tuple to list
* continued refactoring
* WIP: parsing defaults
* Breakdown tests pass
* Window functions fix
* Fix default
* Remove dead argument
* Update filters tests
* Update query_test.exs
* Fix table_decider
* sources tests pass
* Filter suggestions fix
* revenue/goal filter applied refactor
* Update top_stats matching
* Get stats_controller tests passing
* Update neighbor_aggregate_time_on_page
* Refactor Query.remove_event_filters into Query.remove_filters, add new callsites
* Move goal where clause building to new WhereBuilder module
* Move event:name filters
* Move more filters to WhereBuilder
* Update fragment to allow non-static meta columns
* Build where clause for events table using WhereBuilder
* Build sessions table where clause using WhereBuilder
* Move time range filtering and site checking to WhereBuilder
* WhereBuilder.build_condition method
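The gist of WhereBuilder, sketched with illustrative names (not the actual module), is to turn each parsed filter into an Ecto dynamic fragment that both the events and sessions queries can compose into their `where` clauses:

```elixir
defmodule ExampleWhereBuilder do
  import Ecto.Query

  def build_condition(:is, field_name, value),
    do: dynamic([t], field(t, ^field_name) == ^value)

  def build_condition(:is_not, field_name, value),
    do: dynamic([t], field(t, ^field_name) != ^value)

  def build_condition(:member, field_name, values),
    do: dynamic([t], field(t, ^field_name) in ^values)

  # Individual conditions compose with `and` into a single where clause.
  def combine(conditions) do
    Enum.reduce(conditions, true, fn condition, acc -> dynamic(^acc and ^condition) end)
  end
end
```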
* Remove TODO
* _rest pattern for TableDecider, Query pattern matching
Future-proofing in a tiny way
* Hacky fix to get tests passing for Google API tests
* Typespec fix
* Merge conflict
* refactor special goal filter logic in imported.ex
* Docs feedback
* put_filter
---------
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
* Apply filters in search console request
* Remove dead code from search console modal
* Remove unimportant information from keyword modal
* Show invalid filters from search console
* Fix tests
* Add/Fix tests
* Fix typo
* Remove unused variable
* Fix typo
* Changelog entry
* Fix Credo
* Display impressions, CTR and position in keyword modal
* Undo change that should not have been committed
* Fix test
* Fix test
* filters -> search_console_filters
* Implement adjusting imported date range to actual and existing stats
* Drop redundant prefix from import list entries
* Make pageview numbers in imports list formatted for readability
* Test and improve date range cropping
* DRY UA and GA4 stats start and end date API calls
* Extend UA/GA import controller tests and improve error handling
* refactor finding longest open range without existing data
* Fix typo in test description
Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
* Rename `open_ranges` to `free_ranges`
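A rough sketch of the cropping idea (names and shapes are illustrative, not the actual implementation): walk the requested date range, drop the days already covered by existing imports, and keep the longest consecutive stretch that remains.

```elixir
defmodule ExampleFreeRanges do
  # wanted: the Date.Range requested for import
  # occupied: list of Date.Range structs already covered by existing data
  def longest_free_range(%Date.Range{} = wanted, occupied) do
    chunks =
      wanted
      |> Enum.reject(fn day -> Enum.any?(occupied, &(day in &1)) end)
      |> Enum.chunk_while([], &chunk_consecutive/2, &flush/1)

    case chunks do
      [] -> nil
      chunks -> chunks |> Enum.max_by(&length/1) |> to_range()
    end
  end

  defp chunk_consecutive(day, []), do: {:cont, [day]}

  defp chunk_consecutive(day, [prev | _] = acc) do
    if Date.diff(day, prev) == 1,
      do: {:cont, [day | acc]},
      else: {:cont, Enum.reverse(acc), [day]}
  end

  defp flush([]), do: {:cont, []}
  defp flush(acc), do: {:cont, Enum.reverse(acc), []}

  defp to_range(days), do: Date.range(List.first(days), List.last(days))
end
```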
---------
Co-authored-by: Robert Joonas <robertjoonas16@gmail.com>
Co-authored-by: RobertJoonas <56999674+RobertJoonas@users.noreply.github.com>
* Unify GA4 and UA import flow into one
* Clean up property and view data retrieval via Google HTTP APIs
* Turn `Map.get` into `Map.fetch!` in API response processing code
* Bump list account summaries page size limit to max of 200
* Show only views in legacy flow and fix legacy redirect after import start
* Move google analytics import actions tests to a separate module
* Extend Google Analytics controller tests
* DRY up `property?` predicate (h/t @RobertJoonas)
* Implement LV date input using flatpickr
* Implement basics of GA4 import (very dirty WIP)
* Split Google HTTP API into UA and GA4 specific parts
* Add a quick way to record GA4 API responses
* Add first GA4 import fixtures with GA4 Data API responses
* Extract GA4 and UA specific logic from Google API
* Extract UA and GA4 specific actions to distinct controllers
* Add integration test for GA4 importer
* Update GA4 fixtures
* Test GA4 API
* Add debug logging and fix paginating through API results in GA4 import
* Revert "Implement LV date input using flatpickr"
This reverts commit c696f8ee39d5702f27015c09a4f079ca124cc7bb.
* Fix note
* Create a stub of site settings section for imports and exports
* Use legacy site import indication to determine UA import handling
* Add provisional logos for upcoming import sources
* Stub basics of import page
* Add very rudimentary support for multiple UA imports
* Implement imports list as live view
* Add support for opening LV modal from backend and closing from frontend
* Introduce notion of themes to `button` and `button_link` components
* Add confirmation modal on deleting import
* Swap GA4 logo
* Implement disabled state support for `button_link` component
* Disable export and non-implemented import sources
* Use native stats start date for upper boundary of import time range
* Ensure integrations view uses legacy UA import flow
* Remove unnecessary preload in SiteController
* Remove unnecessary exception for legacy imports
* Move API controller stats tests under PlausibleWeb
* Test listing imports
* Add test for explicit listener setup
* Add tests for legacy flag state in UA importer
* Add test for purging legacy import data
* Add tests for `Sites.native_stats_start_date`
* Test forgetting imports
* Add `Stats.Clickhouse.imported_pageview_counts/1` and fix test flakiness
* Show page view counts on imports list
* Add tests for static imports and exports view
* Adjust button look slightly
* Use `case` instead of `cond`
* Make feature flag customisable per site
* Fix buttons and empty state styling
* Add another import to seeds
* Use JS confirm dialog instead of modal for deletion confirmations
* Revert "Add support for opening LV modal from backend and closing from frontend"
This reverts commit 260e6c753032b451542e24be9edc2118790b5a00.
* Default `legacy` to false when inserting new import jobs
* Drop `method` attribute from `button_link` and `unstyled_link` components
* Move imported tables schemas to separate modules outside Google ns
* Move buffer for imports to Imported ns
* fix schema newlines
* Extract UA import processing and persistence
* Decouple analytics worker implementation from UA
* Rename env variable for import buffer size
* Preserve old import queue until release
This commit fixes a bug where Google Analytics import tokens were not
being refreshed properly because the function was not returning the
expected tuple. Thanks to @aerosol we can nicely test this now.
* Make TestUtils module available in all tests
* Add macros patching the application env in tests
Unfortunately a lot of existing functionality relies on
certain application env setup. This isn't ideal because
the app config is a shared state that prevents us from
running the tests in parallel.
Those macros encapsulate setting up new env for test purposes
and make sure the changes are reverted when the test finishes.
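A minimal sketch of what such a macro can look like (the actual TestUtils implementation may differ): override an application env key for the duration of a test and restore the previous value when the test exits.

```elixir
defmodule ExamplePatchEnv do
  defmacro patch_env(key, value) do
    quote do
      old = Application.fetch_env(:plausible, unquote(key))
      Application.put_env(:plausible, unquote(key), unquote(value))

      # on_exit/1 comes from ExUnit.Callbacks in the test using the macro.
      on_exit(fn ->
        case old do
          {:ok, old_value} -> Application.put_env(:plausible, unquote(key), old_value)
          :error -> Application.delete_env(:plausible, unquote(key))
        end
      end)
    end
  end
end
```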
* Allow passing request opts to HTTPClient.post/4
We need this to swap custom request building in
Google Analytics import.
* Unify errors when listing sites
* React: propagate backend error messages if available
* React: catch API errors in Search Terms component
* Propagate google API errors on referrer drilldown
* Handle verified properties errors in SC settings
* Add missing tests for SC settings controller
* Unify errors for fetching search analytics queries (list stats)
* Unify errors refreshing Google Auth Token
* Test fetch_stats/3 errors and replace Double with Mox
* Fixup markup
* s/class/className
* Simplify Search Terms display in case of errors
* Fix warnings
The Google Analytics report request may take some time, especially for big
imports with thousands of pages. To mitigate the issue, this commit increases
the timeout to 60s and lowers the page size to 7,500 records per request.
* Return error tuple instead of exception when GA request fails
* Report Google Analytics failed request body to Sentry
* Improve Google Analytics function @doc
* fixup! Report Google Analytics failed request body to Sentry
* fixup! Improve Google Analytics function @doc
* List all Google Analytics views during import
This commit fixes a bug where different Google Analytics views with the
same name and URI were not shown. This was caused by GA views being
stored as a map, which naturally doesn't support duplicate keys.
This change updates the GA views list to display view IDs, making it
clearer to know what is being imported. The dropdown is now grouped by
website URL.
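Illustratively (field names are assumptions), the fix amounts to keeping views as a grouped list of `{label, view_id}` options instead of a map keyed by name, so duplicates survive and the label shows the view ID:

```elixir
defmodule ExampleViewList do
  def group_views(views) do
    views
    |> Enum.group_by(& &1.website_url)
    |> Enum.map(fn {url, views} ->
      {url, Enum.map(views, fn view -> {"#{view.name} (#{view.id})", view.id} end)}
    end)
  end
end
```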
* Put Google Analytics API URLs in app env
* Add controller test to GA view list
This commit fixes a bug where fetching Google Search Console keywords
raised a FunctionClauseError. This was introduced in #2046. This commit
also adds test coverage.
* Create separate module for GA HTTP requests
* Fetch GA data entirely instead of monthly
* Add buffering to GA imports
* Change positional args to maps when serializing from GA
* Create Google Analytics VCR tests
* Add has_imported_stats boolean to Site
* Add Google Analytics import panel to general settings
* Get GA profiles to display in import settings panel
* Add import_from_google method as entrypoint to import data
* Add imported_visitors table
* Remove conflicting code from migration
* Import visitors data into clickhouse database
* Pass another dataset to main graph for rendering in red
This adds another entry to the JSON data returned via the main graph API
called `imported_plot`, which is similar to `plot` in form but will be
completed with previously imported data. Currently it simply returns
the values from `plot` / 2. The data is rendered in the main graph in
red without fill, and without an indicator for the present. Rationale:
imported data will not continue to grow so there is no projection
forward, only backwards.
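For illustration, the payload described above has roughly this shape (values invented; at this stage `imported_plot` is just `plot` halved):

```elixir
%{
  "plot" => [120, 95, 130, 80],
  "imported_plot" => [60, 47, 65, 40]
}
```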
* Hook imported GA data to dashboard timeseries plot
* Add settings option to forget imported data
* Import sources from google analytics
* Merge imported sources when queried
* Merge imported source data and native data when querying sources
* Start converting metrics to atoms so they can be subqueried
This changes "visitors" and in some places "sources" to atoms. This does
not change the behaviour of the functions - the tests all pass unchanged
following this commit. This is necessary as joining subqueries requires
that the keys in `select` statements be atoms and not strings.
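The constraint comes from Ecto: fields selected in a subquery are only addressable from the outer query when the select keys are atoms. A hedged sketch (table and column names are assumptions for the example):

```elixir
import Ecto.Query

native_q = from(e in "events", select: %{visitors: count(e.user_id, :distinct)})
imported_q = from(i in "imported_visitors", select: %{visitors: sum(i.visitors)})

# With string keys in the selects above, n.visitors / i.visitors could not be referenced here.
combined =
  from(n in subquery(native_q),
    cross_join: i in subquery(imported_q),
    select: %{visitors: n.visitors + i.visitors}
  )
```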
* Convert GA (direct) source to empty string
* Import utm campaign and utm medium from GA
* format
* Import all data types from GA into new tables
* Handle larger amounts of data more safely
* Fix some mistakes in tables
* Make GA requests in chunks of 5 queries
* Only display imported timeseries when there is no filter
* Correctly show last 30 minutes timeseries when 'realtime'
* Add with_imported key to Query struct
* Account for injected :is_not filter on sources from dashboard
* Also add tentative imported_utm_sources table
This needs a bit more work on the Google import side, as GA does not
report sources and utm sources as distinct things.
* Return imported data to dashboard for rest of Sources panel
This extends the merge_imported function definition for sources to
utm_sources, utm_mediums and utm_campaigns too. This appears to be
working on the DB side but something is incomplete on the client side.
* Clear imported stats from all tables when requested
* Merge entry pages and exit pages from imported data into unfiltered dashboard view
This requires converting the `"visits"` and `"visit_duration"` metrics
to atoms so that they can be used in ecto subqueries.
* Display imported devices, browsers and OSs on dashboard
* Display imported country data on dashboard
* Add more metrics to entries/exits for modals
* make sure data is returned via API with correct keys
* Import regions and cities from GA
* Capitalize device upon import to match native data
* Leave query limits/offsets until after possibly joining with imported data
* Also import timeOnPage and pageviews for pages from GA
* imported_countries -> imported_locations
* Get timeOnPage and pageviews for pages from GA
These are needed for the pages modal, and for calculating exit rates for
exit pages.
* Add indicator to dashboard when imported data is being used
* Don't show imported data as a separate line on main graph
* "bounce_rate" -> :bounce_rate, so it works in subqueries
* Drop imported browser and OS versions
These are not needed.
* Toggle displaying imported data by clicking indicator
* Parse referrers with RefInspector
- Use 'ga:fullReferrer' instead of 'ga:source'. This provides the actual
referrer host + path, whereas 'ga:source' includes utm_mediums and
other values when relevant.
- 'ga:fullReferrer' does, however, include search engine names directly,
so they are manually checked for as RefInspector won't pick up on
these.
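A rough sketch of that parsing (helper names and the search engine list are illustrative): prefix the GA value with a scheme so RefInspector can parse it, and handle the search engine names GA reports directly.

```elixir
defmodule ExampleReferrerParser do
  # GA reports some search engines as bare names, which RefInspector won't match.
  @search_engines ~w(google bing duckduckgo yahoo)

  def parse(full_referrer) when full_referrer in @search_engines, do: full_referrer

  def parse(full_referrer) do
    case RefInspector.parse("https://" <> full_referrer) do
      %{source: :unknown} -> full_referrer
      %{source: source} -> source
    end
  end
end
```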
* Keep imported data indicator on dashboard and strikethrough when hidden
* Add unlink google button to import panel
* Rename some GA browsers and OSes to plausible versions
* Get main top pages and exit pages panels working correctly with imported data
* mix format
* Fetch time_on_pages for imported data when needed
* entry pages need to fetch bounces from GA
* "sample_percent" -> :sample_percent as only atoms can be used in subqueries
* Calculate bounce_rate for joined native and imported data for top pages modal
* Flip some query bindings around to be less misleading
* Fixup entry page modal visit durations
* mix format
* Fetch bounces and visit_duration for sources from GA
* add more source metrics used for data in modals
* Make sources modals display correct values
* imported_visitors: bounce_rate -> bounces, avg_visit_duration -> visit_duration
* Merge imported data into aggregate stats
* Reformat top graph side icons
* Ensure sample_percent is yielded from aggregate data
* filter event_props should be strings
* Hide imported data from frontend when using filter
* Fix existing tests
* fix tests
* Fix imported indicator appearing when filtering
* comma needed, lost when rebasing
* Import utm_terms and utm_content from GA
* Merge imported utm_term and utm_content
* Rename imported Countries data as Locations
* Set imported city schema field to int
* Remove utm_terms and utm_content when clearing imported
* Clean locations import from Google Analytics
- Country and region should be set to "" when GA provides "(not set)"
- City should be set to 0 for "unknown", as we cannot reliably import
city data from GA.
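Sketched as simple normalisers (function names are illustrative):

```elixir
defmodule ExampleLocationCleaner do
  def clean_country_or_region("(not set)"), do: ""
  def clean_country_or_region(value), do: value

  # City IDs can't be reliably resolved from GA, so unknown becomes 0.
  def clean_city("unknown"), do: 0
  def clean_city(city_id), do: city_id
end
```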
* Display imported region and city in dashboard
* os -> operating_system in some parts of code
The inconsistency of using os in some places and operating_system in
others causes trouble with subqueries and joins for the native and
imported data, which would require additional logic to account for. The
simplest solution is to just use a consistent word for all uses. This
doesn't make any user-facing or database changes.
* to_atom -> to_existing_atom
* format
* "events" metric -> :events
* ignore imported data when "events" in metrics
* update "bounce_rate"
* atomise some more metrics from new city and region api
* atomise some more metrics for email handlers
* "conversion_rate" -> :conversion_rate during csv export
* Move imported data stats code to own module
* Move imported timeseries function to Stats.Imported
* Use Timex.parse to import dates from GA
* has_imported_stats -> imported_source
* "time_on_page" -> :time_on_page
* Convert imported GA data to UTC
* Clean up GA request code a bit
There was some weird logic here with two separate lists that really
ought to be together, so this merges those.
* Fail sooner if GA timezone can't be identified
* Link imported tables to site by id
* imported_utm_content -> imported_utm_contents
* Imported GA from all of time
* Reorganise GA data fetch logic
- Fetch data from the start of time (2005)
- Check whether any data was fetched; if not, inform the user and don't
consider the data to be imported.
* Clarify removal of "visits" data when it isn't in metrics
* Apply location filters from API
This makes it consistent with the sources etc., which filter out 'Direct /
None' on the API side. These filters are used by both the native and
imported data handling code, which would otherwise both duplicate the
filters in their `where` clauses.
* Do not use changeset for setting site.imported_source
* Add all metrics to all dimensions
* Run GA import in the background
* Send email when GA import completes
* Add handler to insert imported data into tests and imported_browsers_factory
* Add remaining import data test factories
* Add imported location data to test
* Test main graph with imported data
* Add imported data to operating systems tests
* Add imported data to pages tests
* Add imported data to entry pages tests
* Add imported data to exit pages tests
* Add imported data to devices tests
* Add imported data to sources tests
* Add imported data to UTM tests
* Add new test module for the data import step
* Test import of sources GA data
* Test import of utm_mediums GA data
* Test import of utm_campaigns GA data
* Add tests for UTM terms
* Add tests for UTM contents
* Add test for importing pages and entry pages data from GA
* Add test for importing exit page data
* Fix module file name typo
* Add test for importing location data from GA
* Add test for importing devices data from GA
* Add test for importing browsers data from GA
* Add test for importing OS data from GA
* Paginate GA requests to download all data
* Bump clickhouse_ecto version
* Move RefInspector wrapper function into module
* Drop timezone transform on import
* Order imported data by site_id then date
* More strings -> atoms
Also changes a conditional to be a bit nicer
* Remove parallelisation of data import
* Split sources and UTM sources from fetched GA data
GA has only a "source" dimension and no "UTM source" dimension. Instead
it returns these combined. The logic herein to tease these apart is:
1. "(direct)" -> it's a direct source
2. if the source is a domain -> it's a source
3. "google" -> it's from adwords; let's make this a UTM source "adwords"
4. else -> just a UTM source
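As a sketch (the domain check and return shape are assumptions, not the actual code), that branching looks like:

```elixir
defmodule ExampleSourceSplit do
  def split(ga_source) do
    cond do
      # 1. "(direct)" -> a direct visit, no source at all
      ga_source == "(direct)" -> %{source: nil, utm_source: nil}
      # 2. looks like a domain -> a regular referral source
      String.contains?(ga_source, ".") -> %{source: ga_source, utm_source: nil}
      # 3. "google" -> treated as AdWords traffic, recorded as UTM source "adwords"
      ga_source == "google" -> %{source: nil, utm_source: "adwords"}
      # 4. everything else -> just a UTM source
      true -> %{source: nil, utm_source: ga_source}
    end
  end
end
```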
* Keep prop names in queries as strings
* fix typo
* Fix import
* Insert data to clickhouse in batches
* Fix link when removing imported data
* Merge source tables
* Import hostname as well as pathname
* Record start and end time of imported data
* Track import progress
* Fix month interval with imported data
* Do not JOIN when imported date range has no overlap
* Fix time on page using exits
Co-authored-by: mcol <mcol@posteo.net>