mongo/buildscripts/cost_model
Max Verbinnen 9813c77669 SERVER-107790 Design workload for SORT_MERGE (#41084)
GitOrigin-RevId: ab90fe3618437be7a0f4861c8fb7e5647f8c68e0
2025-09-10 15:06:07 +00:00
.gitignore SERVER-107356 Create workload for calibrating the cost of LIMIT (#38543) 2025-07-15 18:56:23 +00:00
BUILD.bazel SERVER-100631 Add ruff into "bazel run lint" (#32166) 2025-06-11 14:11:11 +00:00
OWNERS.yml SERVER-103079 Replace granular QO code owners teams with a single coarse team (#34298) 2025-04-10 06:25:19 +00:00
README.md SERVER-85932 Update README for cost model calibration setup (#40665) 2025-08-28 16:25:03 +00:00
benchmark.py SERVER-106251 Modify cost model calibration code to be able to parse classic execution trees (#37438) 2025-06-20 20:55:36 +00:00
calibration_settings.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
ce_data_settings.py SERVER-94077 Use isort in Ruff configs (#27865) 2024-10-10 19:33:49 +00:00
ce_generate_data.py SERVER-104999 port motor to pymongo async (#36697) 2025-05-30 13:07:39 +00:00
common.py SERVER-106251 Modify cost model calibration code to be able to parse classic execution trees (#37438) 2025-06-20 20:55:36 +00:00
config.py SERVER-108290 Support multiple calibrations for single node (#39235) 2025-07-30 18:58:10 +00:00
cost_estimator.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
data_generator.py SERVER-104999 port motor to pymongo async (#36697) 2025-05-30 13:07:39 +00:00
database_instance.py SERVER-107786 Create workload for calibration of COLLSCAN node (#38773) 2025-07-23 19:31:00 +00:00
end_to_end.py SERVER-106252 Remove ABT from experiment pipeline and complete test run (with no data) (#37595) 2025-06-27 17:34:28 +00:00
execution_tree_classic.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
execution_tree_sbe.py SERVER-106251 Modify cost model calibration code to be able to parse classic execution trees (#37438) 2025-06-20 20:55:36 +00:00
experiment.py SERVER-107786 Create workload for calibration of COLLSCAN node (#38773) 2025-07-23 19:31:00 +00:00
mongod-inmemory.yaml SERVER-96080: Remove references to bonsai in buildscripts/cost_model (#32641) 2025-04-09 22:23:09 +00:00
mongod.yaml SERVER-96080: Remove references to bonsai in buildscripts/cost_model (#32641) 2025-04-09 22:23:09 +00:00
parameters_extractor_classic.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
parameters_extractor_sbe.py SERVER-106252 Remove ABT from experiment pipeline and complete test run (with no data) (#37595) 2025-06-27 17:34:28 +00:00
physical_tree.py SERVER-94077 Use isort in Ruff configs (#27865) 2024-10-10 19:33:49 +00:00
qsn_calibrator.py SERVER-107790 Design workload for SORT_SIMPLE and SORT_DEFAULT (#40970) 2025-09-08 09:50:31 +00:00
qsn_costing_parameters.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
query_solution_tree.py SERVER-106251 Modify cost model calibration code to be able to parse classic execution trees (#37438) 2025-06-20 20:55:36 +00:00
random_generator.py SERVER-94077 Use isort in Ruff configs (#27865) 2024-10-10 19:33:49 +00:00
requirements.txt SERVER-104999 port motor to pymongo async (#36697) 2025-05-30 13:07:39 +00:00
start.py SERVER-107790 Design workload for SORT_MERGE (#41084) 2025-09-10 15:06:07 +00:00
workload_execution.py SERVER-107786 Create workload for calibration of COLLSCAN node (#38773) 2025-07-23 19:31:00 +00:00

README.md

Cost Model Calibrator

Getting Started

1) Set Up Mongod

First, prepare the MongoDB server:

  1. Activate the standard virtual environment:
     source python3-venv/bin/activate
  2. Build server with optimizations (makes doc insertion faster):
     (python3-venv) bazel build --config=opt install-devcore
  3. Run mongod instance:
     (python3-venv) bazel-bin/install-mongod/bin/mongod --setParameter internalMeasureQueryExecutionTimeInNanoseconds=true
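
Before switching to the second terminal, it can help to confirm that the mongod instance is accepting connections. The following is a minimal Python sketch, assuming mongod listens on localhost at its default port 27017 (that is, no --port or --bind_ip options were passed). The helper name wait_for_port is hypothetical and is not part of the cost model scripts.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 27017, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP connect only shows that something is listening.
            # It does not verify that the server was started with
            # internalMeasureQueryExecutionTimeInNanoseconds=true.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(0.5)
```

Calling wait_for_port() with no arguments polls the assumed default address for up to 30 seconds.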

2) Set Up Cost Model Calibrator

In another terminal:

  1. Navigate to the cost model directory:
     cd buildscripts/cost_model
  2. Set up Python alias to use MongoDB toolchain:
     alias python=/opt/mongodbtoolchain/v4/bin/python3
  3. Deactivate any existing Python environment (if needed):
     deactivate
  4. Create new virtual environment:
     /opt/mongodbtoolchain/v4/bin/python3 -m venv cm
  5. Activate the new environment:
     source cm/bin/activate
  6. Install required packages:
     (cm) python -m pip install -r requirements.txt
  7. Run the calibrator:
     (cm) python start.py

Note: The first run takes a while because it has to generate the data. After that, as long as you aren't modifying the collections, you can comment out await generator.populate_collections() in start.py, which makes subsequent runs much faster.

  1. When done, deactivate the environment:
     (cm) deactivate
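
The note above suggests commenting out await generator.populate_collections() once the data exists. One alternative is to gate that call behind an environment variable, so the file does not need to be edited between runs. The sketch below is illustrative only: the populate_collections() call is the only name taken from this README, while the FakeGenerator class, the main and should_populate functions, and the CM_SKIP_POPULATE variable are hypothetical stand-ins. The real start.py is structured differently.

```python
import asyncio
import os


class FakeGenerator:
    """Hypothetical stand-in for the data generator object used in start.py."""

    def __init__(self) -> None:
        self.populated = False

    async def populate_collections(self) -> None:
        # The real method generates and inserts the calibration data.
        self.populated = True


def should_populate(env=None) -> bool:
    """Populate unless CM_SKIP_POPULATE (a hypothetical variable) is set to 1."""
    env = os.environ if env is None else env
    return env.get("CM_SKIP_POPULATE", "0") != "1"


async def main(generator, env=None) -> None:
    # Skip the slow data generation step only when explicitly requested.
    if should_populate(env):
        await generator.populate_collections()
    # ... the calibration steps would follow here ...
```

With a change along these lines, a later run could be started as CM_SKIP_POPULATE=1 python start.py, while a plain python start.py would still regenerate the data.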

Install New Packages

  1. Install the package:
     (cm) python -m pip install <package_name>
  2. Update requirements.txt:
     (cm) python -m pip freeze > requirements.txt
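
After updating requirements.txt, a quick check that the active environment matches the pinned versions can catch a stale file. The following is a rough sketch using only the standard library. It assumes the file consists of name==version lines as written by pip freeze and ignores anything else. The function names parse_pins and find_mismatches are hypothetical.

```python
from importlib import metadata


def parse_pins(text: str) -> dict:
    """Map package name to pinned version for every 'name==version' line."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # skip blanks, comments, and non-pinned entries
        name, version = line.split("==", 1)
        # Drop any environment marker that follows the version.
        pins[name.strip().lower()] = version.split(";", 1)[0].strip()
    return pins


def find_mismatches(pins: dict) -> dict:
    """Return {name: (pinned, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Run inside the cm environment, find_mismatches(parse_pins(open("requirements.txt").read())) returns an empty dict when every pinned package is installed at the pinned version.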