At this point, we have covered many of the basics of SQL queries, including how to begin to put together complex queries. This assignment will have you perform data analytics tasks against a simplified business database using the skills you have learned.
This database contains 5 tables with approximately 45,000 rows of data.
Scenario: You are performing analytics for “Cycle,” an e-bike rental company specializing in providing on-demand e-bikes to customers all over the city. (Comparable Real Companies: Bird, Lime)
Your task is to find answers for some business questions by querying their database.
§1 1. Start the Database Instances #
- Extract the project archive to a new directory.
- Open a terminal window and navigate into that directory.
- Confirm you are in the correct place by running
ls:
$ ls
docker-compose.yaml init README.md
- Launch the database container:
docker-compose up
This command will start the PostgreSQL instances.
Logs will stream to your terminal; leave this window open while you work.
$ docker-compose up
Attaching to postgres-a4
postgres-a4 | The files belonging to this database system will be owned by user "postgres".
postgres-a4 | This user must also own the server process.
postgres-a4 |
postgres-a4 | The database cluster will be initialized with locale "en_US.utf8".
postgres-a4 | The default database encoding has accordingly been set to "UTF8".
postgres-a4 | The default text search configuration will be set to "english".
postgres-a4 |
postgres-a4 | Data page checksums are disabled.
postgres-a4 |
postgres-a4 | fixing permissions on existing directory /var/lib/postgresql/data ... ok
postgres-a4 | creating subdirectories ... ok
postgres-a4 | selecting dynamic shared memory implementation ... posix
...
To shut down the databases after completing the assignment, press Ctrl + C (⌘ + C on MacOS) in this same terminal window.
If you accidentally close the window, simply reopen a terminal and navigate bask to the assignment directory, then run docker-compose up again to re-attach.
§2 2. Connect to the Database #
- To connect to the part 1 database, run:
docker exec -it postgres-a4 psql - This will immediately launch the psql CLI against the database.
- Use SQL commands to find answers to the business questions.
- Write down the SQL commands you use to answer each question, along with the final value.
- Type
exitto leave the database session when you are done.
§3 3. Stop the Database Instance #
Do not shut down the container until you have finished the assignment.
If you accidentally stop the database, repeat the commands from section 1.
When ready:
- Return to the terminal running
docker-compose up, you will see the output logs from the database instance. - Press
Ctrl + Cto gracefully stop both containers. - Verify nothing is running:
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
# empty output
§4 4. Submission #
Create a plain-text submission on canvas with numbered sections for each question:
Answer each question with the requested information, followed by the SQL query/queries you used.
§5 5. Questions #
Monthly Revenue Produce one row per calendar month (YYYY-MM) showing total revenue from rides rounded to nearest cents. (Use
TO_CHAR(<DATE_VALUE_HERE>, 'YYYY-MM')for date formatting)
Expected columns: month, revenue_dollars.Monthly Customer Signups
Find the number of new customers who signed up each month in 2024.
Columns: month, new_customers
HINT: you can compare datesdate_value >= DATE '2025-09-01'Top-5 Stations by Net Traffic
Net traffic = (rides that started here) + (rides that ended here).
List the five busiest stations, their city, and net traffic count.Average Ride Duration by Gender
Return one row per gender with the average ride duration in minutes rounded to three decimals.Payment-Type Totals
Provide the total revenue by payment type in dollars.
Columns: payment_type, total_revenue.Payment-Type Market Share
Give the percentage share of revenue by payment type.
HINT: you can calculate the percent aggregate across the entire table withOVER ()
e.g.ROUND(SUM(p.amount_cents) * 100.0 / SUM(SUM(p.amount_cents)) OVER (), 2)
Columns: payment_type, share_pct.Station Utilization
Find the top 5 most utilized stations
Return columns: station_name, total_rides.Missing Revenue
Find the difference between the ride cost and value paid by all customers (In dollars).Lost Revenue from “Voucher” Failures
Assume that voucher payments fail 5% of the time and are refunded immediately.
Estimate how much revenue (in dollars) Cycle-Rentals “lost” in the month of2025-09because of voucher failures.
HINT: you can compare datesdate_value >= DATE '2025-09-01'Next-Best-Station Recommendation
For Station-1, find the station most frequently paired as the destination for rides starting at Station-1.
Return: recommended_station_name, frequency.Long Rides
Count the number of rides that lasted longer than 60 minutes.Most-Traveled customers
Find the top 10 customers that have traveled the longest distance.
Output: customer_id, first_name, last_name, lifetime_travel_distance
You can calculate the distance (in miles) from two coordinate pairs with this formula:
2 * 3961 * ASIN(
SQRT(
POWER(SIN((RADIANS(s2.latitude - s1.latitude)) / 2), 2) +
COS(RADIANS(s1.latitude)) * COS(RADIANS(s2.latitude)) *
POWER(SIN((RADIANS(s2.longitude - s1.longitude)) / 2), 2)
)
)