Seller Performance Analysis¶
Order Fulfillment Efficiency¶
16. What is the average time gap between order_approved_at and order_delivered_carrier_date for each seller? Who are the 'Bottom 10%' of sellers who consistently take the longest to hand over packaged orders to the logistics partner?
Solution:
-
Firstly, we calculate the dispatch time per order (
seller_lead_timesCTE): We join theordersandorder_itemstables to calculate the exactdispatch_hoursfor each order by finding the difference betweenorder_approved_atandorder_delivered_carrier_date. We use minutes divided by 60.0 to safely prevent integer overflow errors, and filter out 'canceled' or 'unavailable' orders. -
Secondly, we aggregate and filter seller performance (
seller_performance_summaryCTE): Grouping byseller_id, we calculate theavg_dispatch_hoursand the total number of orders fulfilled. Crucially, we filter out negative time values (potential data entry errors) and use aHAVINGclause to only include sellers with at least 5 orders, ensuring our averages are statistically relevant. -
Thirdly, we segment sellers into deciles (
seller_efficiency_ranksCTE): We use thentile(10)window function ordered byavg_dispatch_hours desc. This divides the sellers into 10 equal buckets, placing the absolute slowest 10% of sellers into bucket 1. -
Finally, we isolate the worst offenders (Final
SELECT): We filter the results to only includeefficiency_decile = 1(the bottom 10%). We format the output to show both the average dispatch hours and days, highlighting the sellers who consistently take the longest to hand over packages to the logistics partner.
Show SQL Code
with seller_lead_times as (
-- step 1: calculate hours between approval and carrier handover
-- FIX: Changed 'second' to 'minute' to prevent integer overflow from data anomalies
select
oi.seller_id,
o.order_id,
dispatch_hours = datediff(minute, o.order_approved_at, o.order_delivered_carrier_date) / 60.0
from
orders o
inner join
order_items oi on o.order_id = oi.order_id
where
o.order_status not in ('canceled', 'unavailable')
and o.order_approved_at is not null
and o.order_delivered_carrier_date is not null
),
seller_performance_summary as (
-- step 2: aggregate averages per seller
select
seller_id,
count(distinct order_id) as total_orders_fulfilled,
avg(dispatch_hours) as avg_dispatch_hours
from
seller_lead_times
where
dispatch_hours >= 0 -- filtering out potential data entry errors
group by
seller_id
having
count(distinct order_id) >= 5 -- filtering for sellers with enough history to be statistically relevant
),
seller_efficiency_ranks as (
-- step 3: segment sellers into deciles based on speed
-- ntile(10) with order by desc puts the slowest 10% into bucket 1
select
*,
efficiency_decile = ntile(10) over (order by avg_dispatch_hours desc)
from
seller_performance_summary
)
-- final output: isolating the "bottom 10%" slowest sellers
select
seller_id,
total_orders_fulfilled,
cast(avg_dispatch_hours as decimal(10,2)) as avg_dispatch_hours,
cast(avg_dispatch_hours / 24.0 as decimal(10,2)) as avg_dispatch_days
from
seller_efficiency_ranks
where
efficiency_decile = 1 -- selecting the slowest decile
order by
avg_dispatch_hours desc;
Results:
| seller_id | total_orders_fulfilled | avg_dispatch_hours | avg_dispatch_days |
|---|---|---|---|
| ecccfa2bb93b34a3bf033cc5d1dcdc69 | 14 | 622.66 | 25.94 |
| 6cf476a4ca74498db55cbccdaa9dcfb6 | 5 | 537.57 | 22.40 |
| 5d378b73ab7dd6f0418d743e5dcb0bd1 | 7 | 520.27 | 21.68 |
| fa74b2f3287d296e9fbd2cc80f2d1cf1 | 5 | 431.49 | 17.98 |
| 66e0557ecc2b4dbea057e93f215f68d8 | 30 | 430.22 | 17.93 |
| 4b1eaadf791bdbbad8c4a35b65236d52 | 5 | 427.52 | 17.81 |
| 6ee85be3693ed79a8e80718743d80655 | 5 | 406.15 | 16.92 |
| ed859002ad59dbf8cf3602696a6c3000 | 7 | 401.96 | 16.75 |
| 54965bbe3e4f07ae045b90b0b8541f52 | 73 | 401.74 | 16.74 |
| cf8ab1616079e2793aa29d524df01bb1 | 6 | 375.38 | 15.64 |
| 5058e8c1e82653974541e83690655b4a | 62 | 368.89 | 15.37 |
| c990d6cf976a5718aaedc539f383ab88 | 9 | 364.59 | 15.19 |
| b5abf4f36adc043117b4fca82c22984c | 8 | 360.96 | 15.04 |
| 633ecdf879b94b5337cca303328e4a25 | 6 | 360.05 | 15.00 |
| 7fc87cc3e89b3d1d5cabdca32f8485aa | 14 | 353.33 | 14.72 |
| 38874e327ce94d11390b96eb42d61928 | 6 | 347.97 | 14.50 |
| 817f85dbb65aa3e70831d90fe75cdf89 | 11 | 345.08 | 14.38 |
| b1b3948701c5c72445495bd161b83a4c | 15 | 338.28 | 14.10 |
| 6fd52c528dcb38be2eea044946b811f8 | 67 | 331.49 | 13.81 |
| 8bd0e3abda539b9479c4b44a691be1ec | 6 | 330.62 | 13.78 |
| 822b63912576852aea9a8436d72317b7 | 6 | 311.81 | 12.99 |
| 538caafddff204241cecbf3a02e6b3cf | 8 | 310.68 | 12.95 |
| 8603f0038fe0e52fedd3382d266723de | 10 | 303.20 | 12.63 |
| c42fd8e4d47dfb18ce5222f2dd7752f9 | 7 | 300.21 | 12.51 |
| 8d46553a36e68f95350a200c12f8f2e2 | 9 | 290.26 | 12.09 |
| 95b293867b5862941c9cd74f756a3c68 | 9 | 289.23 | 12.05 |
| 6d988d6174a2c27441597174f8905515 | 19 | 288.83 | 12.03 |
| 17f51e7198701186712e53a39c564617 | 56 | 281.81 | 11.74 |
| adcf50477d6a1f5d91bb9d54c4903cb5 | 18 | 281.75 | 11.74 |
| b19f3ca2ea475913750f25a5c37c8d8f | 19 | 279.30 | 11.64 |
| 7c67e1448b00f6e969d365cea6b010ab | 980 | 278.39 | 11.60 |
| cee48807215b30a12ca2ca10ffb5f250 | 39 | 273.07 | 11.38 |
| d32e3909cfb714fb2a6cb092c85b9c5f | 10 | 269.76 | 11.24 |
| 8444e55c1f13cd5c179851e5ca5ebd00 | 93 | 262.63 | 10.94 |
| 973f21788dfab357250f69a8dcb7ddee | 9 | 262.33 | 10.93 |
| ad781527c93d00d89a11eecd9dcad7c1 | 39 | 261.46 | 10.89 |
| ea6b12bf9ffe2bac34602ec631d97a47 | 9 | 259.74 | 10.82 |
| d71d863e5ef30d94e440c11be17dcd8f | 23 | 259.41 | 10.81 |
| 312ba1d77e9c332ef21f9598b7f64cd7 | 7 | 258.87 | 10.79 |
| 4cf490a58259286ada5ba8525ba9e84a | 8 | 257.20 | 10.72 |
| a7f13822ceb966b076af67121f87b063 | 75 | 256.97 | 10.71 |
| f1ed6bd0a9b11b581f16c851c6a5a527 | 17 | 256.92 | 10.70 |
| a2e874074c877c5a05abae80ad6e488f | 40 | 255.92 | 10.66 |
| 2eb70248d66e0e3ef83659f71b244378 | 188 | 253.07 | 10.54 |
| 20b54c376b794ed028df09a3cd88e8dc | 7 | 243.37 | 10.14 |
| 579891617139df7d8671d373f0669622 | 7 | 243.31 | 10.14 |
| a425f92c199eb576938df686728acd20 | 16 | 242.70 | 10.11 |
| b55638ad525e906c698fa2ce742c1742 | 11 | 239.22 | 9.97 |
| 1b8b75e227c9a9c100d0c210fb6176ce | 17 | 236.98 | 9.87 |
| 02d35243ea2e497335cd0f076b45675d | 14 | 234.78 | 9.78 |
| a2fa0bdc798ee84cdb08281337cf4fb6 | 8 | 233.81 | 9.74 |
| 3faf68a3b0af94b10bac70d86077be49 | 7 | 232.27 | 9.68 |
| 30c7f28fd3a5897b2c82d152bb760c17 | 6 | 230.73 | 9.61 |
| 70b52a4cfc823994561b00bad161b4ed | 8 | 228.05 | 9.50 |
| 610f72e407cdd7caaa2f8167b0163fd8 | 15 | 221.48 | 9.23 |
| 835f0f7810c76831d6c7d24c7a646d4d | 42 | 219.52 | 9.15 |
| 2089a6d640999f9b9141ac719b2af596 | 42 | 219.48 | 9.14 |
| 054694fa03fe82cec4b7551487331d74 | 21 | 215.54 | 8.98 |
| 38102b031c2a15e54623d711bfc753d3 | 9 | 205.01 | 8.54 |
| f4c4daa86e30c7e5a553a8d518ac03a5 | 18 | 204.63 | 8.53 |
| f1b93673502375d491780bb49d615dbc | 8 | 199.69 | 8.32 |
| 93dc87703c046b603023e75222018b45 | 35 | 198.00 | 8.25 |
| ec4608a1f76453166bb312b2968aeaf4 | 17 | 197.64 | 8.24 |
| 54219883e72aad869adfb2a54b7bfa0f | 5 | 196.32 | 8.18 |
| f76a3b1349b6df1ee875d1f3fa4340f0 | 24 | 194.54 | 8.11 |
| 5f67c6082caacb26e431a7b17940cece | 13 | 192.03 | 8.00 |
| 87e24fc7052259b1f7f607d61c10d8be | 7 | 191.80 | 7.99 |
| 33576ec5412fb5905d876f12f33bfde6 | 31 | 191.75 | 7.99 |
| c37b2059d4f90d4feead554e5246565e | 12 | 191.61 | 7.98 |
| 4917cee8d902e13428c3ec4b1ca6f315 | 31 | 191.43 | 7.98 |
| 88460e8ebdecbfecb5f9601833981930 | 243 | 191.06 | 7.96 |
| 9baf5cb77970f539089d09a38bcec5c3 | 31 | 189.97 | 7.92 |
| dc8798cbf453b7e0f98745e396cc5616 | 37 | 186.29 | 7.76 |
| 89bc797e2e29667aa035d4368f9b7f92 | 17 | 185.90 | 7.75 |
| b335c59ab742f751a85db9c411a86739 | 30 | 184.34 | 7.68 |
| 7ecd59e5e20407131822c1a68ac59c1f | 24 | 184.16 | 7.67 |
| 9539213aa8e023c01937bef95dc006da | 9 | 184.06 | 7.67 |
| 81336a0f57a33776419727a300249e90 | 16 | 183.55 | 7.65 |
| 834f8533b2ecb6598dd004ff3de7203a | 41 | 182.89 | 7.62 |
| 38e6dada03429a47197d5d584d793b41 | 7 | 182.66 | 7.61 |
| 8f78f0903005064036736c7173a5c2ed | 13 | 181.55 | 7.56 |
| 6bd69102ab48df500790a8cecfc285c2 | 5 | 180.75 | 7.53 |
| 41ab63a91b8b264e8c8780368bf1dd5b | 12 | 180.66 | 7.53 |
| f419677537694f0462a91990df6ca44f | 6 | 179.79 | 7.49 |
| 213b25e6f54661939f11710a6fddb871 | 153 | 178.06 | 7.42 |
| ab3e0c171fe84a7ba7de130f19cfb485 | 6 | 175.02 | 7.29 |
| 784ba75dd9d20200c4caed3d7a77141a | 19 | 173.63 | 7.23 |
| 741f8c587cb4248ebc5b779a0e8e0e51 | 5 | 173.47 | 7.23 |
| cb41bfbcbda0aea354a834ab222f9a59 | 11 | 171.51 | 7.15 |
| 6562efe88ce0826a4ca4f189f03b4b84 | 9 | 171.48 | 7.15 |
| f7496d659ca9fdaf323c0aae84176632 | 25 | 170.90 | 7.12 |
| 5151aea44289d6c6b090ee31c2132508 | 7 | 170.73 | 7.11 |
| c60b801f2d52c7f7f91de00870882a75 | 40 | 169.90 | 7.08 |
| 5882820ef11c4c068eea0677a7a879b4 | 14 | 168.89 | 7.04 |
| ffff564a4f9085cd26170f4732393726 | 8 | 168.42 | 7.02 |
| a67780aba9dd436e7ac0aa1f889e73ec | 6 | 167.90 | 7.00 |
| db2956745b3a8e9f3785c99f34b5d25e | 46 | 166.93 | 6.96 |
| 1b7e5006cb25dd23ce8686691013b121 | 11 | 166.48 | 6.94 |
| b6dc74ed30f124f95227bed4e43a3bed | 11 | 166.43 | 6.93 |
| cb5df0dcb9e280b1780e9d589889f2c7 | 11 | 165.89 | 6.91 |
| cce6ab8d1682639fe45ab70234f1665f | 62 | 164.88 | 6.87 |
| a49928bcdf77c55c6d6e05e09a9b4ca5 | 98 | 164.35 | 6.85 |
| 0307f7565ff85b299b6e5ef162b1b1a9 | 6 | 163.34 | 6.81 |
| c1ed9e3daf6154542826a6996af0fb8c | 8 | 163.29 | 6.80 |
| c6381d2d013342748761e906d45aff76 | 15 | 162.97 | 6.79 |
| 712e6ed8aa4aa1fa65dab41fed5737e4 | 79 | 162.96 | 6.79 |
| e504a4e2efaa45cbff7e268a2c58c956 | 10 | 162.95 | 6.79 |
| beadbee30901a7f61d031b6b686095ad | 64 | 162.53 | 6.77 |
| 1da366cade6d8276e7d8beea7af5d4bf | 43 | 161.53 | 6.73 |
| 7fdb0720c8d7c9075538b365dc8c3a22 | 9 | 159.16 | 6.63 |
| 85d9eb9ddc5d00ca9336a2219c97bb13 | 506 | 158.41 | 6.60 |
| 4e17c65a516f69d023a2ae78b84f28d6 | 60 | 158.16 | 6.59 |
| dd7ddc04e1b6c2c614352b383efe2d36 | 121 | 157.70 | 6.57 |
| a888faf2d1baececa6baf9c3d603ee1f | 53 | 157.43 | 6.56 |
| c64a2aec32cc408a8a4c6d7c46017f91 | 9 | 157.34 | 6.56 |
| 1976a05ecf2996d6ffb7b5430e635fe7 | 8 | 156.38 | 6.52 |
| 6c177e38df6d3f34182b1f1d427231bf | 55 | 154.76 | 6.45 |
| 6a51fc556dab5f766ced6fbc860bc613 | 25 | 153.92 | 6.41 |
| d5b9a084373da994a6e37f732169a853 | 7 | 153.90 | 6.41 |
| f97e2b0650c2068227e1d366140eb62f | 16 | 153.14 | 6.38 |
| e21a1e8be4715b0fcfd5385643bc49c4 | 21 | 152.46 | 6.35 |
| 6039e27294dc75811c0d8a39069f52c0 | 64 | 152.18 | 6.34 |
| 1dcfa3835714681d4ba4a93bc1e0fd41 | 12 | 150.38 | 6.27 |
| 8378eb36b6b70c5734e83ad7b8cdc3b7 | 7 | 149.29 | 6.22 |
| de66a66f2dd06bb9ec37aa96987466a3 | 7 | 148.99 | 6.21 |
| ba6ce6de456f6c11d08c17701ba5c26f | 6 | 148.86 | 6.20 |
| df560393f3a51e74553ab94004ba5c87 | 28 | 148.54 | 6.19 |
| 7bcd7c5f8631701474db233ccf1c094b | 9 | 147.80 | 6.16 |
| ffeee66ac5d5a62fe688b9d26f83f534 | 14 | 147.67 | 6.15 |
| 33c51922b6e3c16b24a8f28e5833c99d | 18 | 145.64 | 6.07 |
| 8a87611c08849ffeeccab52aa798b6c7 | 6 | 145.53 | 6.06 |
| dfc475d54e1b6dbeeb7d7d9bdaa63827 | 18 | 144.34 | 6.01 |
| e81e1f9c3e762e620f3fb60a51499e16 | 17 | 143.37 | 5.97 |
| c24173065483e421f9e5e7f599de101d | 9 | 143.09 | 5.96 |
| 99002261c568a84cce14d43fcffb43ea | 36 | 142.65 | 5.94 |
| 5b179e9e8cc7ab6fd113a46ca584da81 | 34 | 142.05 | 5.92 |
| 71271995e85f5b8530be99ed54a91b89 | 6 | 141.72 | 5.90 |
| 8e6cc767478edae941d9bd9eb778d77a | 104 | 140.90 | 5.87 |
| c3acdfac4e3e97ff87529454fbc03642 | 12 | 140.80 | 5.87 |
| f08a5b9dd6767129688d001acafc21e5 | 11 | 140.71 | 5.86 |
| 5b0cc932433fa5184b5b94bfe6bcc256 | 12 | 140.10 | 5.84 |
| bbad7e518d7af88a0897397ffdca1979 | 68 | 139.96 | 5.83 |
| 855668e0971d4dfd7bef1b6a4133b41b | 306 | 139.38 | 5.81 |
| e06f09ec0a4aca210779cf1cfc63cf19 | 29 | 139.33 | 5.81 |
| 8a40ae794fa3f9892cee24bd5af0488b | 5 | 139.15 | 5.80 |
| 04308b1ee57b6625f47df1d56f00eedf | 93 | 139.12 | 5.80 |
| 37dd29b36c458d82f74a953c40c43645 | 7 | 138.78 | 5.78 |
| e067ad2c1c0b48758eb1b5228bcf7a68 | 54 | 138.37 | 5.77 |
| eb4df17aed01d918c65f0f8d650900c0 | 11 | 137.84 | 5.74 |
| 6973a06f484aacf400ece213dbf3d946 | 92 | 137.67 | 5.74 |
| f9903dd0f812c7e771fcad47b6fff231 | 10 | 137.38 | 5.72 |
| f9244d45189d3a3605499abddeade7d5 | 21 | 136.88 | 5.70 |
| 9dda5bbacd45e18d6485fee649205d09 | 19 | 136.88 | 5.70 |
| e8b3a3a38279a82f0e5d006d5e5b7d2c | 80 | 136.76 | 5.70 |
| d93844a9c55ba7ce353388bcf849ea56 | 12 | 136.60 | 5.69 |
| 271c58a1d139c45eaf3316107c6d3a3b | 28 | 135.53 | 5.65 |
| 236585a4ddb5fea9fa40233e2249ed91 | 11 | 134.85 | 5.62 |
| 4c8b8048e33af2bf94f2eb547746a916 | 19 | 134.26 | 5.59 |
| 25debeafbce801fdd479539350185eee | 5 | 133.99 | 5.58 |
| cac4c8e7b1ca6252d8f20b2fc1a2e4af | 74 | 132.26 | 5.51 |
| 656591be56071d4c9ef4e5fee78a578a | 7 | 132.12 | 5.50 |
| c4d51195486dc781531876a7d00453d8 | 17 | 131.29 | 5.47 |
| 4d600e08ecbe08258c79e536c5a42fee | 7 | 129.89 | 5.41 |
| 184a67a8f9f63234d3a92340bbdb727f | 8 | 129.79 | 5.41 |
| dd2bdf855a9172734fbc3744021ae9b9 | 106 | 128.78 | 5.37 |
| 63b464dbf392c7b80d12d932fa7cafed | 36 | 128.64 | 5.36 |
| cd843d4cf8ef32827de0396605163ba3 | 5 | 128.59 | 5.36 |
| 4e8dacf3d38f281ae26c3e0321d92d88 | 8 | 128.42 | 5.35 |
| 343e716476e3748b069f980efbaa294e | 17 | 127.90 | 5.33 |
| 014c0679dd340a0e338872e7ec85666a | 14 | 127.87 | 5.33 |
| e819bcfade7b5d88a27325eb6cfd62c5 | 5 | 127.71 | 5.32 |
| 6e0908ef4d4efadbc3cc2b74ea477cb0 | 8 | 127.41 | 5.31 |
| 575df70bde3f9f2b30bf8d2e9910d725 | 27 | 127.12 | 5.30 |
| 0873d9f8f36123f8d910f4760e788cfb | 6 | 126.96 | 5.29 |
| 373ee4a3a775a733770ca3f790b8b9ac | 13 | 126.89 | 5.29 |
| 062ce95fa2ad4dfaedfc79260130565f | 54 | 126.73 | 5.28 |
Insight:
- Extreme Outliers: The absolute slowest seller (
ecccfa...) takes an average of 25.9 days just to hand over the package to the carrier! However, the worst offenders at the very top of the list mostly have low order volumes (between 5 and 14 orders), suggesting they might be inactive, poorly managed, or drop-shipping from overseas. - The High-Volume Red Flag: Seller
7c67e144...is a massive problem for the platform. They fulfilled 980 orders but have an incredibly slow average dispatch time of 11.6 days. Because of their high volume, this single seller is likely generating a huge amount of late deliveries and negative customer reviews. - Unacceptable Baselines: In modern e-commerce, dispatching an item should take 1 to 2 days. The sellers on this "Bottom" list are consistently taking over a week (7+ days) just to get the product out of their warehouse. This indicates a severe supply chain or inventory management issue on the seller's side.
- Actionable Next Step: Olist needs to enforce strict Service Level Agreements (SLAs). Sellers like
7c67e144...should be penalized or warned, as they are large enough to seriously damage Olist's brand reputation with their 11-day packing delays.
Pareto Analysis for Sellers¶
17. Do the top 10% of sellers generate more than 50% of the platform's total revenue? Furthermore, which product categories are dominated by a single 'Monopoly Seller' vs. categories with a healthy, competitive mix of multiple vendors?
Solution:
-
Firstly, we calculate global seller revenue (
seller_global_revenueandglobal_paretoCTEs): We join theorder_items,payments, andorderstables to sum up the total revenue for each seller across the entire platform. We then use thentile(10)window function to rank sellers into deciles, which helps identify the platform-wide revenue concentration. -
Secondly, we calculate seller market share per category (
category_seller_shareCTE): We group the sales data by bothproduct_category_nameandseller_idto find how much revenue each seller generated in that specific category. We also use a window functionsum() over (partition by product_category_name)to calculate the total overall revenue for the entire category without grouping away the seller-level details. -
Thirdly, we identify the top seller's dominance (
category_dominanceCTE): Grouping strictly by category, we count the total number of unique vendors (total_vendors) and find the highest revenue generated by a single seller (max(seller_category_revenue)). By dividing this top seller's revenue by the total category revenue, we calculate the exact market share percentage held by the leading vendor (leader_market_share_pct). -
Finally, we classify the competitive health of each market (Final
SELECT): We use aCASEstatement to evaluate the market conditions. Categories where a single seller holds over 50% market share alongside fewer than 5 total vendors are flagged as 'Monopoly / Highly Concentrated', while categories with a leading share under 15% and over 20 vendors are labeled as 'Healthy / Fragmented Competition'.
Show SQL Code
with seller_global_revenue as (
-- step 1: calculate total revenue for every seller across the platform
select
oi.seller_id,
sum(p.payment_value) as total_seller_revenue
from
order_items oi
inner join
payments p on oi.order_id = p.order_id
inner join
orders o on oi.order_id = o.order_id
where
o.order_status not in ('canceled', 'unavailable')
group by
oi.seller_id
),
global_pareto as (
-- step 2: rank sellers into deciles to check platform-wide concentration
select
seller_id,
total_seller_revenue,
ntile(10) over (order by total_seller_revenue desc) as seller_decile,
sum(total_seller_revenue) over () as platform_total_revenue
from
seller_global_revenue
),
category_seller_share as (
-- step 3: calculate revenue share for each seller within their specific category
select
prod.product_category_name,
oi.seller_id,
sum(p.payment_value) as seller_category_revenue,
sum(sum(p.payment_value)) over (partition by prod.product_category_name) as total_category_revenue
from
order_items oi
inner join
products prod on oi.product_id = prod.product_id
inner join
payments p on oi.order_id = p.order_id
inner join
orders o on oi.order_id = o.order_id
where
o.order_status not in ('canceled', 'unavailable')
and prod.product_category_name is not null
group by
prod.product_category_name, oi.seller_id
),
category_dominance as (
-- step 4: identify the share of the 'top seller' in each category
select
product_category_name,
count(distinct seller_id) as total_vendors,
max(seller_category_revenue) as top_seller_revenue,
total_category_revenue,
-- market share of the leading seller
cast(100.0 * max(seller_category_revenue) / total_category_revenue as decimal(5,2)) as leader_market_share_pct
from
category_seller_share
group by
product_category_name, total_category_revenue
)
-- final output: categorizing product categories by competitive health
select
product_category_name,
total_vendors,
leader_market_share_pct,
case
when leader_market_share_pct >= 50 and total_vendors < 5 then 'Monopoly / Highly Concentrated'
when leader_market_share_pct >= 30 then 'Dominant Leader'
when leader_market_share_pct < 15 and total_vendors > 20 then 'Healthy / Fragmented Competition'
else 'Moderate Competition'
end as market_health_status
from
category_dominance
order by
leader_market_share_pct desc;
Results:
| product_category_name | total_vendors | leader_market_share_pct | market_health_status |
|---|---|---|---|
| cds_dvds_musicais | 1 | 100.00 | Monopoly / Highly Concentrated |
| la_cuisine | 2 | 94.95 | Monopoly / Highly Concentrated |
| portateis_cozinha_e_preparadores_de_alimentos | 5 | 74.34 | Dominant Leader |
| moveis_escritorio | 34 | 72.74 | Dominant Leader |
| casa_conforto_2 | 4 | 71.76 | Monopoly / Highly Concentrated |
| telefonia_fixa | 49 | 70.72 | Dominant Leader |
| tablets_impressao_imagem | 6 | 68.33 | Dominant Leader |
| fraldas_higiene | 7 | 67.53 | Dominant Leader |
| pcs | 9 | 67.33 | Dominant Leader |
| seguros_e_servicos | 2 | 64.42 | Monopoly / Highly Concentrated |
| pc_gamer | 2 | 59.92 | Monopoly / Highly Concentrated |
| flores | 3 | 51.52 | Monopoly / Highly Concentrated |
| portateis_casa_forno_e_cafe | 15 | 50.67 | Dominant Leader |
| moveis_colchao_e_estofado | 5 | 46.49 | Dominant Leader |
| eletrodomesticos | 48 | 46.29 | Dominant Leader |
| casa_conforto | 20 | 46.09 | Dominant Leader |
| livros_importados | 7 | 45.89 | Dominant Leader |
| artes | 35 | 43.33 | Dominant Leader |
| bebidas | 36 | 43.13 | Dominant Leader |
| fashion_roupa_feminina | 11 | 42.24 | Dominant Leader |
| artigos_de_festas | 12 | 41.25 | Dominant Leader |
| fashion_calcados | 13 | 41.23 | Dominant Leader |
| artigos_de_natal | 17 | 40.30 | Dominant Leader |
| fashion_underwear_e_moda_praia | 9 | 39.90 | Dominant Leader |
| dvds_blu_ray | 10 | 36.88 | Dominant Leader |
| fashion_roupa_infanto_juvenil | 4 | 34.49 | Dominant Leader |
| agro_industria_e_comercio | 34 | 34.30 | Dominant Leader |
| audio | 36 | 34.18 | Dominant Leader |
| papelaria | 173 | 34.14 | Dominant Leader |
| industria_comercio_e_negocios | 29 | 33.68 | Dominant Leader |
| eletrodomesticos_2 | 46 | 32.97 | Dominant Leader |
| ferramentas_jardim | 235 | 32.57 | Dominant Leader |
| artes_e_artesanato | 13 | 32.06 | Dominant Leader |
| livros_interesse_geral | 53 | 31.98 | Dominant Leader |
| construcao_ferramentas_iluminacao | 36 | 31.82 | Dominant Leader |
| construcao_ferramentas_ferramentas | 24 | 30.68 | Dominant Leader |
| fashion_esporte | 10 | 30.27 | Dominant Leader |
| moveis_sala | 55 | 30.00 | Dominant Leader |
| construcao_ferramentas_jardim | 32 | 29.79 | Moderate Competition |
| construcao_ferramentas_seguranca | 34 | 29.18 | Moderate Competition |
| fashion_roupa_masculina | 13 | 28.79 | Moderate Competition |
| sinalizacao_e_seguranca | 46 | 28.41 | Moderate Competition |
| livros_tecnicos | 26 | 27.78 | Moderate Competition |
| moveis_quarto | 23 | 23.77 | Moderate Competition |
| malas_acessorios | 73 | 23.52 | Moderate Competition |
| alimentos_bebidas | 33 | 22.65 | Moderate Competition |
| consoles_games | 81 | 22.19 | Moderate Competition |
| 256 | 22.11 | Moderate Competition | |
| cool_stuff | 263 | 21.04 | Moderate Competition |
| musica | 19 | 20.20 | Moderate Competition |
| instrumentos_musicais | 68 | 20.18 | Moderate Competition |
| telefonia | 146 | 19.99 | Moderate Competition |
| market_place | 59 | 19.92 | Moderate Competition |
| cine_foto | 14 | 19.66 | Moderate Competition |
| climatizacao | 52 | 19.63 | Moderate Competition |
| moveis_decoracao | 366 | 19.03 | Moderate Competition |
| casa_construcao | 68 | 18.41 | Moderate Competition |
| alimentos | 58 | 18.17 | Moderate Competition |
| eletroportateis | 101 | 16.91 | Moderate Competition |
| perfumaria | 174 | 16.69 | Moderate Competition |
| relogios_presentes | 98 | 16.31 | Moderate Competition |
| brinquedos | 249 | 15.49 | Moderate Competition |
| cama_mesa_banho | 195 | 15.05 | Moderate Competition |
| bebes | 242 | 14.60 | Healthy / Fragmented Competition |
| construcao_ferramentas_construcao | 127 | 14.08 | Healthy / Fragmented Competition |
| eletronicos | 146 | 12.82 | Healthy / Fragmented Competition |
| moveis_cozinha_area_de_servico_jantar_e_jardim | 51 | 12.82 | Healthy / Fragmented Competition |
| fashion_bolsas_e_acessorios | 99 | 11.58 | Healthy / Fragmented Competition |
| automotivo | 374 | 11.09 | Healthy / Fragmented Competition |
| informatica_acessorios | 285 | 10.01 | Healthy / Fragmented Competition |
| pet_shop | 135 | 7.22 | Healthy / Fragmented Competition |
| utilidades_domesticas | 463 | 7.10 | Healthy / Fragmented Competition |
| esporte_lazer | 477 | 5.27 | Healthy / Fragmented Competition |
| beleza_saude | 489 | 5.22 | Healthy / Fragmented Competition |
Insights:
- True Monopolies in Niche Markets: Very small categories like cds_dvds_musicais and la_cuisine are completely run by 1 or 2 sellers holding 95% to 100% of the market share. This is expected for low-demand or highly specialized items.
- The Illusion of Competition: This is a major finding. Categories like moveis_escritorio (office furniture) look competitive on paper because they have 34 vendors. However, one single seller controls almost 73% of the market. We see the exact same thing in telefonia_fixa (landline phones), where the top seller holds 70% despite having 48 competitors.
- Healthy Core Markets: The best news for Olist is that its top-earning categories are heavily fragmented. Beleza_saude (health & beauty) and esporte_lazer (sports & leisure) each have nearly 500 active vendors, and the top seller only controls about 5% of the sales. This means the platform's core revenue is highly secure.
- Single Point of Failure (Risk): Any category tagged as "Dominant Leader" carries a huge supply chain risk. If the top seller in pcs (67% share) or moveis_escritorio leaves the platform or starts delivering late, Olist's revenue and reputation in that entire category will instantly tank.
- The Pareto Principle in Action: While the table focuses on category breakdowns rather than the overall platform revenue, it perfectly illustrates the 80/20 rule. Across dozens of categories, a single vendor is doing the heavy lifting and generating the vast majority of the sales volume.
Seller Churn Analysis¶
18. What is the churn rate of sellers on the platform? Specifically, how many sellers who made a sale in 2017 became inactive (zero sales) in 2018? Is there a correlation between high 'Order Cancellation Rates' and subsequent seller churn?
Solution:
-
Firstly, we aggregate annual activity per seller (
seller_activityCTE): We join theordersandorder_itemstables to calculate each seller'stotal_ordersandcanceled_ordersgrouped by the specific year they occurred (extracted from the timestamp using theyear()function). -
Secondly, we build the 2017 cohort and identify churn (
seller_cohort_2017CTE): We isolate sellers who were active in 2017 and perform aLEFT JOINagainst their own 2018 activity data. If the 2018 record is missing (s18.seller_id is null), we classify them as 'Churned (Inactive in 2018)', otherwise they are 'Retained'. We also compute their individual 2017 cancellation rate percentage here. -
Finally, we aggregate metrics by churn status (Final
SELECT): We group the cohort by theirchurn_statusto count the exact number of sellers that stayed versus left. By averaging thecancellation_rate_2017and theorders_2017volume for both groups, we can directly observe the correlation between poor fulfillment performance (high cancellations) and subsequent seller churn on the platform.
Show SQL Code
with seller_activity as (
-- step 1: identify activity years and cancellation counts for every seller
select
oi.seller_id,
year(o.order_purchase_timestamp) as activity_year,
count(distinct o.order_id) as total_orders,
sum(case when o.order_status = 'canceled' then 1 else 0 end) as canceled_orders
from
orders o
inner join
order_items oi on o.order_id = oi.order_id
group by
oi.seller_id, year(o.order_purchase_timestamp)
),
seller_cohort_2017 as (
-- step 2: isolate sellers who were active in 2017 and track their 2018 status
select
s17.seller_id,
s17.total_orders as orders_2017,
cast(100.0 * s17.canceled_orders / s17.total_orders as decimal(5,2)) as cancellation_rate_2017,
case
when s18.seller_id is null then 'Churned (Inactive in 2018)'
else 'Retained (Active in 2018)'
end as churn_status
from
seller_activity s17
left join
seller_activity s18 on s17.seller_id = s18.seller_id and s18.activity_year = 2018
where
s17.activity_year = 2017
)
-- final output: comparing cancellation rates across churned vs retained sellers
select
churn_status,
count(seller_id) as seller_count,
-- average cancellation rate to identify the correlation
cast(avg(cancellation_rate_2017) as decimal(5,2)) as avg_cancellation_rate_pct,
-- average volume to see if smaller or larger sellers churn more
cast(avg(orders_2017 * 1.0) as decimal(10,2)) as avg_2017_order_volume
from
seller_cohort_2017
group by
churn_status;
Results:
| churn_status | seller_count | avg_cancellation_rate_pct | avg_2017_order_volume |
|---|---|---|---|
| Churned (Inactive in 2018) | 685 | 6.06 | 6.34 |
| Retained (Active in 2018) | 1099 | 0.48 | 37.07 |
Insights:
- Massive Drop-Off Rate: A huge chunk of the seller base did not survive into 2018. Out of 1,784 sellers who made sales in 2017, 685 of them (around 38%) completely stopped operating the following year.
- Low Traction Means High Flight Risk: The sellers who churned never really gained momentum. They averaged only 6.34 orders for the entirety of 2017. In contrast, the retained sellers were moving roughly 37 orders a year. If a vendor doesn't see quick sales, they abandon the platform.
- The Cancellation Red Flag: There is a massive, direct correlation between failing to fulfill orders and leaving the platform. The churned sellers had an average cancellation rate of 6.06%, which is over 12 times worse than the healthy, retained sellers (who kept cancellations under 0.5%).
- Profile of a Failing Seller: The data paints a very clear picture. The typical seller who leaves Olist is a low-volume vendor who struggles to actually deliver the few orders they get. This poor performance likely leads to account suspension or them simply giving up.
Identifying Underperforming Sellers¶
19. Which sellers have a high sales volume (>50 orders) but a consistently low average review score. Can we identify specific sellers who are responsible for a disproportionate number of the platforms 1-star reviews?
Solution:
-
Firstly, we analyze the seller churn correlation (Part 1): We aggregate annual activity and cancellation rates per seller (
seller_activityCTE), then isolate the 2017 cohort to track whether they were retained or churned in 2018 (seller_cohort_2017CTE). The first output aggregates the average cancellation rates and order volumes for churned versus retained sellers. -
Secondly, we aggregate review statistics per seller (Part 2 -
seller_review_statsCTE): To evaluate seller quality, we join theorder_items,orders, andorder_reviewstables. Grouping strictly byseller_id, we calculate theirtotal_orders, theiravg_review_score, and calculate their specificone_star_count. -
Thirdly, we calculate the platform-wide baseline (Part 2 -
platform_total_one_starsCTE): We independently calculate thegrand_total_one_starsacross the entire database. This acts as the denominator for measuring an individual seller's negative impact against the whole platform. -
Finally, we identify the worst offenders and their platform impact (Part 2 Final
SELECT): WeCROSS JOINthe seller stats with the platform baseline. We filter for high-volume sellers (total_orders > 50) with poor ratings (avg_review_score < 3). By computingpct_of_platform_one_stars, we can pinpoint exactly how much of the platform's total negative feedback is disproportionately driven by these specific vendors.
Show SQL Code
with seller_activity as (
-- step 1: identify activity years and cancellation counts for every seller
select
oi.seller_id,
year(o.order_purchase_timestamp) as activity_year,
count(distinct o.order_id) as total_orders,
sum(case when o.order_status = 'canceled' then 1 else 0 end) as canceled_orders
from
orders o
inner join
order_items oi on o.order_id = oi.order_id
group by
oi.seller_id, year(o.order_purchase_timestamp)
),
seller_cohort_2017 as (
-- step 2: isolate sellers who were active in 2017 and track their 2018 status
select
s17.seller_id,
s17.total_orders as orders_2017,
cast(100.0 * s17.canceled_orders / s17.total_orders as decimal(5,2)) as cancellation_rate_2017,
case
when s18.seller_id is null then 'Churned (Inactive in 2018)'
else 'Retained (Active in 2018)'
end as churn_status
from
seller_activity s17
left join
seller_activity s18 on s17.seller_id = s18.seller_id and s18.activity_year = 2018
where
s17.activity_year = 2017
)
-- output 1: comparing cancellation rates across churned vs retained sellers
select
churn_status,
count(seller_id) as seller_count,
cast(avg(cancellation_rate_2017) as decimal(5,2)) as avg_cancellation_rate_pct,
cast(avg(orders_2017 * 1.0) as decimal(10,2)) as avg_2017_order_volume
from
seller_cohort_2017
group by
churn_status;
/* PART 2: Seller Review Performance & 1-Star Concentration
Objective:
1. Identify high-volume sellers (>50 orders) with poor ratings (<3 stars).
2. Calculate their contribution to the platform's total 1-star reviews.
*/
with seller_review_stats as (
-- step 1: aggregate review scores and total 1-stars per seller
select
oi.seller_id,
count(distinct o.order_id) as total_orders,
avg(orv.review_score * 1.0) as avg_review_score,
sum(case when orv.review_score = 1 then 1 else 0 end) as one_star_count
from
order_items oi
inner join
orders o on oi.order_id = o.order_id
inner join
order_reviews orv on o.order_id = orv.order_id
group by
oi.seller_id
),
platform_total_one_stars as (
-- step 2: get the total number of 1-star reviews across the whole platform
select sum(case when review_score = 1 then 1 else 0 end) as grand_total_one_stars
from order_reviews
)
-- output 2: flagging high-risk sellers based on volume and negative sentiment
select
s.seller_id,
s.total_orders,
cast(s.avg_review_score as decimal(3,2)) as avg_review_score,
s.one_star_count,
-- concentration: what % of the entire platform's 1-star reviews come from this one seller?
cast(100.0 * s.one_star_count / p.grand_total_one_stars as decimal(5,2)) as pct_of_platform_one_stars
from
seller_review_stats s
cross join
platform_total_one_stars p
where
s.total_orders > 50
and s.avg_review_score < 3
order by
s.one_star_count desc;
Results 1:
| churn_status | seller_count | avg_cancellation_rate_pct | avg_2017_order_volume |
|---|---|---|---|
| Churned (Inactive in 2018) | 685 | 6.06 | 6.34 |
| Retained (Active in 2018) | 1099 | 0.48 | 37.07 |
Results 2:
| seller_id | total_orders | avg_review_score | one_star_count | pct_of_platform_one_stars |
|---|---|---|---|---|
| 2eb70248d66e0e3ef83659f71b244378 | 198 | 2.71 | 81 | 0.71 |
| 1ca7077d890b907f89be8c954a02686a | 114 | 2.20 | 80 | 0.70 |
| a49928bcdf77c55c6d6e05e09a9b4ca5 | 98 | 2.95 | 37 | 0.32 |
| 54965bbe3e4f07ae045b90b0b8541f52 | 74 | 2.94 | 34 | 0.30 |
| 972d0f9cf61b499a4812cf0bfa3ad3c4 | 79 | 2.96 | 29 | 0.25 |
Insights:
- The Worst Offenders: We isolated five specific sellers who process a solid volume of orders (between 74 and 198) but consistently fail their customers, keeping their average rating below 3 stars.
- Disproportionate Damage: The top two sellers on this list (
2eb702...and1ca707...) are incredibly toxic to the platform. Together, these two vendors are responsible for 1.41% of all 1-star reviews generated across the entire Olist marketplace. - High Volume, High Failure Rate: Seller
1ca7077d...is the most concerning data point. They fulfilled 114 orders and racked up 80 one-star reviews. This suggests that a massive majority of their customers had a terrible experience, dragging their average score down to an abysmal 2.20. - Reputation Risk at Scale: Sellers hovering around the 2.9 score (
a499...,5496...,972d...) might not look as disastrous at first glance, but because they process 70 to 100 orders, they are consistently delivering bad experiences to a large group of people. - Actionable Next Step: These vendors are actively hurting the Olist brand. The platform should immediately suspend or heavily penalize these top 5 sellers to protect future buyers.
Conclusion¶
What We Did: * Calculated the average dispatch times for sellers to identify the worst bottlenecks in the fulfillment process. * Performed a Pareto analysis on market share to see which categories are dominated by monopolies versus those with healthy competition. * Analyzed seller churn rates between 2017 and 2018, comparing the performance metrics of retained versus inactive vendors. * Isolated high-volume, low-rated sellers to measure their specific impact on the platform's overall 1-star review count.
What We Obtained (Key Findings): * Severe Dispatch Delays: While some sellers operate efficiently, the bottom tier is completely failing. We identified high-volume vendors taking over 11 days just to pack an item, which creates a massive bottleneck. * The "Dominant Leader" Risk: Many categories look competitive but are actually carried by a single vendor holding up to 70% of the market. This creates a huge supply chain risk if that seller leaves. Luckily, our biggest revenue categories (like health & beauty) are highly fragmented and safe. * Failure Drives Churn: About 38% of 2017 sellers quit in 2018. The data shows they didn't just leave; they failed out. Churned sellers averaged only 6 orders a year and had a cancellation rate 12 times higher than successful, active sellers. * A Few Bad Apples: We pinpointed a tiny group of sellers who are actively damaging the platform's reputation. Just two specific vendors generated over 1.4% of all 1-star reviews on Olist, proving that strict seller quality enforcement is urgently needed.