Payments and Risk Analysis¶
Payment Distribution and Risk Analysis¶
20. What is the distribution of total revenue across different payment methods (Credit Card, Boleto, Voucher, Debit Card)? Specifically, does the 'Average Transaction Value' differ significantly between Credit Card users and Boleto users?
Solution:
Solution Steps for Payment Method Distribution and ATV
-
Firstly, we aggregate metrics by payment type (
payment_distributionCTE): We query thepaymentstable and group bypayment_typeto calculate the totaltransaction_count, thetotal_revenue(sum(payment_value)), and the average transaction value (avg(payment_value)). -
Secondly, we calculate the platform-wide revenue baseline (
global_metricsCTE): We aggregate the sum of all revenue from the previous CTE to establish thegrand_total_revenue. This gives us the necessary denominator for calculating market share percentages. -
Finally, we compare payment methods and value differences (Final
SELECT): WeCROSS JOINthe aggregated payment data with the global baseline. By dividing the payment type's revenue by the grand total, we compute therevenue_share_pct. Additionally, we format theavg_transaction_value(ATV) and use aCASEstatement to classify the transaction profiles, specifically highlighting the differing use cases of 'Credit Card' (High Convenience / Installment Potential) versus 'Boleto' (Cash-based / Single Payment).
Show SQL Code
with payment_distribution as (
-- step 1: aggregate metrics at the payment type level
select
payment_type,
count(*) as transaction_count,
sum(payment_value) as total_revenue,
avg(payment_value) as avg_transaction_value
from
payments
group by
payment_type
),
global_metrics as (
-- step 2: get platform-wide totals for percentage calculations
select
sum(total_revenue) as grand_total_revenue
from
payment_distribution
)
-- final output: comparing payment methods and identifying value differences
select
pd.payment_type,
pd.transaction_count,
cast(pd.total_revenue as decimal(15,2)) as revenue_contribution,
-- revenue share percentage
cast(100.0 * pd.total_revenue / gm.grand_total_revenue as decimal(5,2)) as revenue_share_pct,
-- average transaction value (atv)
cast(pd.avg_transaction_value as decimal(10,2)) as atv,
-- comparison logic specifically for credit card vs boleto
case
when pd.payment_type = 'credit_card' then 'High Convenience / Installment Potential'
when pd.payment_type = 'boleto' then 'Cash-based / Single Payment'
else 'Alternative Method'
end as payment_profile
from
payment_distribution pd
cross join
global_metrics gm
order by
revenue_contribution desc;
Results:
| payment_type | transaction_count | revenue_contribution | revenue_share_pct | atv | payment_profile |
|---|---|---|---|---|---|
| credit_card | 76795 | 12542084.19 | 78.34 | 163.32 | High Convenience / Installment Potential |
| boleto | 19784 | 2869361.27 | 17.92 | 145.03 | Cash-based / Single Payment |
| voucher | 5775 | 379436.87 | 2.37 | 65.70 | Alternative Method |
| debit_card | 1529 | 217989.79 | 1.36 | 142.57 | Alternative Method |
| not_defined | 3 | 0.00 | 0.00 | 0.00 | Alternative Method |
Insights:
- Credit Cards Dominate the Market: Credit cards are by far the most important payment method for Olist, bringing in an overwhelming 78.3% of the total revenue ($12.5M).
- Boleto is Still Essential: The Brazilian cash-based payment method, Boleto, is the second largest driver, securing almost 18% of the revenue. This shows that a solid portion of our customer base relies on cash or doesn't use traditional credit systems.
- Credit Drives Bigger Baskets: There is a clear gap in the Average Transaction Value (ATV). Credit card users spend an average of $163.32, compared to $145.03 for Boleto users. Customers are obviously more willing to make high-ticket purchases when they can take advantage of credit card installments.
- Vouchers Mean Small Spends: Vouchers have the lowest ATV at just $65.70. This makes sense, as they are typically used as gift cards, promotional discounts, or partial payments for smaller items rather than for massive purchases.
- Debit Cards are Rare: Surprisingly, standard debit cards are barely used on the platform, accounting for just 1.36% of total revenue. Customers overwhelmingly prefer the installment options of a credit card or the cash accessibility of a Boleto.
Installment Correlation and High-Ticket Analysis¶
21. Is there a positive correlation between the number of installments chosen (payment_installments) and the total order value? Specifically, what percentage of high-ticket orders (>R$500) are purchased using 5+ installments?"
Solution:
Solution Steps for Payment Method and Installment Analysis
-
Firstly, we compare payment method performance (Part 1): We aggregate the
transaction_count,total_revenue, andavg_transaction_valuegrouped bypayment_type. By cross-joining this with a global revenue total, we calculate therevenue_share_pct, allowing us to compare the financial weight of credit cards versus cash-based methods like Boleto. -
Secondly, we correlate installment counts with order value (Part 2 -
installment_statsCTE): We isolate 'credit_card' transactions—since they are the primary source of installments—and group them by the number ofpayment_installments. We calculate the average payment value for each count to see if customers purchasing expensive items tend to choose higher installment numbers. -
Thirdly, we isolate the high-ticket segment (Part 2 -
high_ticket_segmentCTE): We filter thepaymentstable for orders with a value greater than R$500. We then count the total number of these orders and use a conditional sum to identify how many of them utilized 5 or more installments, effectively identifying "financing" behavior. -
Finally, we calculate the financing penetration (Part 2 Final
SELECT): We compute thehigh_ticket_financing_pctby dividing our high-installment count by the total high-ticket count. To provide a clear correlation proof, we also use subqueries to fetch and compare the average values of 1-installment versus 10-installment purchases, highlighting the trend toward higher installment counts for more expensive goods.
Show SQL Code
with payment_distribution as (
-- step 1: aggregate metrics at the payment type level
select
payment_type,
count(*) as transaction_count,
sum(payment_value) as total_revenue,
avg(payment_value) as avg_transaction_value
from
payments
group by
payment_type
),
global_metrics as (
-- step 2: get platform-wide totals for percentage calculations
select
sum(total_revenue) as grand_total_revenue
from
payment_distribution
)
-- output 1: comparing payment methods and identifying value differences
select
pd.payment_type,
pd.transaction_count,
cast(pd.total_revenue as decimal(15,2)) as revenue_contribution,
cast(100.0 * pd.total_revenue / gm.grand_total_revenue as decimal(5,2)) as revenue_share_pct,
cast(pd.avg_transaction_value as decimal(10,2)) as atv
from
payment_distribution pd
cross join
global_metrics gm
order by
revenue_contribution desc;
/* PART 2: Installment Correlation & High-Ticket Analysis
Objective:
1. Analyze if higher installment counts correlate with higher order values.
2. Calculate the % of high-ticket orders (>R$500) using 5+ installments.
*/
with installment_stats as (
-- step 1: calculate average value per installment count
select
payment_installments,
count(*) as order_count,
avg(payment_value) as avg_payment_value
from
payments
where
payment_type = 'credit_card' -- installments primarily apply to credit cards
and payment_installments > 0
group by
payment_installments
),
high_ticket_segment as (
-- step 2: isolate high-ticket orders and check installment counts
select
count(*) as total_high_ticket_orders,
sum(case when payment_installments >= 5 then 1 else 0 end) as high_installment_count
from
payments
where
payment_value > 500
and payment_type = 'credit_card'
)
-- output 2: identifying the "financing" behavior of high-value customers
select
h.total_high_ticket_orders,
h.high_installment_count,
-- percentage of high-ticket users who "finance" via 5+ installments
cast(100.0 * h.high_installment_count / h.total_high_ticket_orders as decimal(5,2)) as high_ticket_financing_pct,
-- referencing the average value for 1 installment vs 10 installments for context
(select cast(avg_payment_value as decimal(10,2)) from installment_stats where payment_installments = 1) as avg_val_1_inst,
(select cast(avg_payment_value as decimal(10,2)) from installment_stats where payment_installments = 10) as avg_val_10_inst
from
high_ticket_segment h;
Show SQL Code
/* PART 1: Payment Method & Transaction Value Analysis
Objective:
1. Analyze the distribution of revenue across payment types.
2. Compare average transaction value (atv) between credit card and boleto users.
*/
with payment_distribution as (
-- step 1: aggregate metrics at the payment type level
select
payment_type,
count(*) as transaction_count,
sum(payment_value) as total_revenue,
avg(payment_value) as avg_transaction_value
from
payments
group by
payment_type
),
global_metrics as (
-- step 2: get platform-wide totals for percentage calculations
select
sum(total_revenue) as grand_total_revenue
from
payment_distribution
)
-- output 1: comparing payment methods and identifying value differences
select
pd.payment_type,
pd.transaction_count,
cast(pd.total_revenue as decimal(15,2)) as revenue_contribution,
cast(100.0 * pd.total_revenue / gm.grand_total_revenue as decimal(5,2)) as revenue_share_pct,
cast(pd.avg_transaction_value as decimal(10,2)) as atv
from
payment_distribution pd
cross join
global_metrics gm
order by
revenue_contribution desc;
/* PART 2: Installment Correlation & High-Ticket Analysis
Objective:
1. Analyze if higher installment counts correlate with higher order values.
2. Calculate the % of high-ticket orders (>R$500) using 5+ installments.
*/
with installment_stats as (
-- step 1: calculate average value per installment count
select
payment_installments,
count(*) as order_count,
avg(payment_value) as avg_payment_value
from
payments
where
payment_type = 'credit_card' -- installments primarily apply to credit cards
and payment_installments > 0
group by
payment_installments
),
high_ticket_segment as (
-- step 2: isolate high-ticket orders and check installment counts
select
count(*) as total_high_ticket_orders,
sum(case when payment_installments >= 5 then 1 else 0 end) as high_installment_count
from
payments
where
payment_value > 500
and payment_type = 'credit_card'
)
-- output 2: identifying the "financing" behavior of high-value customers
select
h.total_high_ticket_orders,
h.high_installment_count,
-- percentage of high-ticket users who "finance" via 5+ installments
cast(100.0 * h.high_installment_count / h.total_high_ticket_orders as decimal(5,2)) as high_ticket_financing_pct,
-- referencing the average value for 1 installment vs 10 installments for context
(select cast(avg_payment_value as decimal(10,2)) from installment_stats where payment_installments = 1) as avg_val_1_inst,
(select cast(avg_payment_value as decimal(10,2)) from installment_stats where payment_installments = 10) as avg_val_10_inst
from
high_ticket_segment h;
/* PART 3: Regional Payment Preferences & Boleto Penetration
Objective:
1. Determine how payment preferences vary by customer state.
2. Identify states where Boleto usage is significantly higher than the national average.
*/
with state_payment_counts as (
-- step 1: count payment types per state
select
c.customer_state,
count(p.payment_type) as total_payments,
sum(case when p.payment_type = 'boleto' then 1 else 0 end) as boleto_count,
sum(case when p.payment_type = 'credit_card' then 1 else 0 end) as credit_card_count
from
payments p
inner join
orders o on p.order_id = o.order_id
inner join
customers c on o.customer_id = c.customer_id
group by
c.customer_state
),
state_boleto_share as (
-- step 2: calculate state-level boleto share and the national benchmark
select
customer_state,
total_payments,
cast(100.0 * boleto_count / total_payments as decimal(5,2)) as state_boleto_pct,
-- national benchmark using window function
avg(100.0 * boleto_count / total_payments) over () as national_avg_boleto_pct
from
state_payment_counts
)
-- output 3: identifying regional outliers in payment behavior
select
customer_state,
total_payments,
state_boleto_pct,
cast(national_avg_boleto_pct as decimal(5,2)) as national_avg_pct,
-- deviation from national norm
cast(state_boleto_pct - national_avg_boleto_pct as decimal(5,2)) as percentage_point_diff,
case
when state_boleto_pct > national_avg_boleto_pct + 5 then 'High Boleto Usage (Low Credit Penetration)'
when state_boleto_pct < national_avg_boleto_pct - 5 then 'High Credit Usage (Digital/Banked Hub)'
else 'Standard Regional Mix'
end as regional_payment_profile
from
state_boleto_share
order by
state_boleto_pct desc;