Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders
Alexey Drutsa
Setup
Second-Price (SP) Auction with Reserve Prices
❑ Setting
► A good (e.g., an ad space) is offered for sale by a seller to $M$ buyers
► Each buyer $m$ holds a private valuation $v^m \in [0,1]$ for this good ($v^m$ is unknown to the seller)
❑ Actions
► The seller selects a reserve price $p^m$ for each buyer $m$
► Each buyer $m$ submits a bid $b^m$
❑ Allocation and payments
► Determine the actual buyer-participants: $\mathbb{M} = \{m \mid b^m \ge p^m\}$
► The good is received by the buyer $\overline{m} = \operatorname{argmax}_{m \in \mathbb{M}} b^m$ (the one with the highest bid)
► This buyer pays $\overline{p}^{\overline{m}} = \max\{p^{\overline{m}}, \max_{m \in \mathbb{M} \setminus \{\overline{m}\}} b^m\}$
Repeated Second-Price Auctions with Reserve
Identical goods (e.g., ad spaces) are repeatedly offered for sale
► by a seller (e.g., an RTB platform) to $M$ buyers (e.g., advertisers)
► over $T$ rounds (one good per round).
Each buyer $m$
► holds a private fixed valuation $v^m \in [0,1]$ for each of those goods,
► $v^m$ is unknown to the seller.
At each round $t = 1, \dots, T$, the seller conducts an SP auction with reserves:
► the seller selects a reserve price $p_t^m$ for each buyer $m$
► and a bid $b_t^m$ is submitted by each buyer $m$.
Seller's pricing algorithm
► The seller applies a pricing algorithm $\mathcal{A}$ that sets reserve prices $\{p_t^m\}_{t=1,m=1}^{T,M}$ in response to bids $\mathbf{b} = \{b_t^m\}_{t=1,m=1}^{T,M}$ of buyers $m = 1, \dots, M$
► A price $p_t^m$ can depend only on past bids $\{b_s^n\}_{s=1,n=1}^{t-1,M}$ and the horizon $T$.
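The setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `sp_auction_round` and `run_repeated_auction` are hypothetical, ties between equal bids go to the lowest buyer index, and each bidder is modeled as a callable that sees the round number and his own reserve price. None of these choices comes from the slides.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Outcome = Tuple[Optional[int], float]  # (winner index or None, payment)


def sp_auction_round(reserves: Sequence[float], bids: Sequence[float]) -> Outcome:
    """One second-price auction with personal reserve prices."""
    # Participants are the buyers whose bid reaches their own reserve price.
    participants = [m for m in range(len(bids)) if bids[m] >= reserves[m]]
    if not participants:
        return None, 0.0  # the good stays unsold
    # Highest bid wins; max() keeps the first maximum (tie-break assumption).
    winner = max(participants, key=lambda m: bids[m])
    rival_bids = [bids[m] for m in participants if m != winner]
    # The winner pays the larger of his reserve and the best rival bid.
    payment = max([reserves[winner]] + rival_bids)
    return winner, payment


def run_repeated_auction(
    pricing: Callable[[List[List[float]], int], List[float]],
    bidders: Sequence[Callable[[int, float], float]],
    horizon: int,
) -> List[Outcome]:
    """Repeat the auction; `pricing` sees only past bids and the horizon."""
    past_bids: List[List[float]] = []
    outcomes: List[Outcome] = []
    for t in range(horizon):
        reserves = pricing(past_bids, horizon)
        bids = [bid(t, reserves[m]) for m, bid in enumerate(bidders)]
        outcomes.append(sp_auction_round(reserves, bids))
        past_bids.append(bids)
    return outcomes


# Example: constant reserves of 0.5 and two truthful buyers (0.9 and 0.6).
example = run_repeated_auction(
    lambda past, T: [0.5, 0.5],
    [lambda t, p: 0.9, lambda t, p: 0.6],
    3,
)
```

In the example, buyer 0 wins every round and pays the rival bid 0.6, which exceeds his reserve of 0.5.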
Strategic buyers
❑ The seller announces her pricing algorithm $\mathcal{A}$ in advance
In each round $t$, each buyer $m$
► observes the history of previous rounds (the part available to this buyer) and
► chooses his bid $b_t^m$ so that it maximizes his future $\gamma_m$-discounted surplus:
$\mathrm{Sur}_t(\mathcal{A}, v^m, \gamma_m, \{b_s^m\}) := \mathbb{E}\Big[\sum_{s=t}^{T} \gamma_m^{s-1}\, \mathbb{I}_{\{m = \overline{m}_s\}}\, (v^m - \overline{p}_s^m)\Big], \quad \gamma_m \in (0,1],$
where $\mathbb{I}_{\{m = \overline{m}_s\}}$ is the indicator of the event that buyer $m$ is the winner in round $s$, and $\overline{p}_s^m$ is the payment of buyer $m$ in this case
Seller's goal
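The seller's strategic regret: $\mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) := \sum_{t=1}^{T} \big(\max_m v^m - \mathbb{I}_{\{\mathbb{M}_t \neq \emptyset\}}\, \overline{p}_t^{\overline{m}_t}\big)$
She seeks a no-regret pricing for worst-case valuations: $\sup_{v^1, \dots, v^M \in [0,1]} \mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) = o(T)$
Optimality: the lowest possible upper bound for the regret of the form $O(f(T))$.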
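To make the regret definition concrete, here is a minimal sketch that evaluates it on a recorded sequence of rounds. The function name `strategic_regret` and the representation of a round as a `(winner, payment)` pair, with `winner = None` when nobody reaches a reserve price, are assumptions made for this example.

```python
from typing import Optional, Sequence, Tuple


def strategic_regret(
    valuations: Sequence[float],
    outcomes: Sequence[Tuple[Optional[int], float]],
) -> float:
    """Sum over rounds of (max valuation - revenue actually collected)."""
    best = max(valuations)
    regret = 0.0
    for winner, payment in outcomes:
        # A round without participants brings no revenue.
        revenue = payment if winner is not None else 0.0
        regret += best - revenue
    return regret


# Three rounds: sold at 0.5, unsold, sold at 0.8; the best valuation is 0.9.
example_regret = strategic_regret([0.9, 0.4], [(0, 0.5), (None, 0.0), (0, 0.8)])
```

The three rounds contribute 0.4, 0.9 and 0.1, so the example regret is about 1.4.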
Background, Research question & Main contribution
Background: 1-buyer case (posted-price auctions)
If there is one buyer ($M = 1$), an SP auction reduces to a posted-price auction:
► the buyer either accepts or rejects the currently offered price $p_t^1$
► the seller either gets a payment equal to $p_t^1$ or nothing
[Kleinberg et al., FOCS'2003] Optimal algorithm against a myopic buyer with truthful regret $\Theta(\log\log T)$.
[Amin et al., NIPS'2013] The strategic setting is introduced. No no-regret pricing exists for the non-discounted case $\gamma = 1$.
[Drutsa, WWW'2017] Optimal algorithm against a strategic buyer with regret $\Theta(\log\log T)$ for $\gamma < 1$.
Research question
The known optimal algorithms (PRRFES & prePRRFES) from posted-price auctions cannot be directly applied to set reserve prices in second-price auctions
► buyers in SP auctions have incomplete information due to the presence of rivals
► the proofs of optimality of [pre]PRRFES strongly rely on complete information
❑ In this study, I try to find an optimal algorithm for the multi-buyer setup
Main contribution
► A novel algorithm for our strategic buyers with a regret upper bound of $\Theta(\log\log T)$ for $\gamma < 1$
► A novel transformation that maps any pricing algorithm designed for posted-price auctions to a multi-buyer setup
Main ideas
Two learning processes
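$\mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) := \sum_{t=1}^{T} \big(\max_m v^m - \mathbb{I}_{\{\mathbb{M}_t \neq \emptyset\}}\, \overline{p}_t^{\overline{m}_t}\big)$
► Learning process #1: find the buyers' valuations
► Learning process #2: find which buyer has the maximal valuation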
Learning proc.#1: an idea to localize a valuation
PRRFES is an optimal learner of a valuation in posted-price auctions. However, its core localization technique relies on:
❑ The buyer completely knows the outcomes of the current and all future rounds, given his bids (due to the absence of rivals)
Can we use PRRFES in the second-price scenario, where each buyer does not perfectly know the outcomes of rounds?
Barrage pricing
► Reserve prices are personal (individual) in our setup
► Thus, we are able to "eliminate" particular buyers from particular rounds
► Namely, a buyer $m$ will not bid above $1/(1 - \gamma_m)$
► We call such a price a "barrage" price and denote it by $p^{bar}$
Let us "eliminate" all buyers except some buyer $m$ in a round $t$. Then buyer $m$ will have complete information about the outcome of this round $t$
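A small sketch may help to see why barrage prices turn a second-price round into a posted-price one. It assumes a common upper bound `gamma_0` on all discount rates, so that a single price $1/(1 - \gamma_0)$ is a barrage for every buyer. The helper names and the bound `gamma_0` as a function argument are choices made for this example, not part of the slides.

```python
from typing import List, Optional, Sequence, Tuple


def barrage_price(gamma_0: float) -> float:
    """A price no buyer with discount at most gamma_0 is expected to accept."""
    return 1.0 / (1.0 - gamma_0)


def reserves_with_one_active_buyer(
    active: int, price: float, num_buyers: int, gamma_0: float
) -> List[float]:
    """Barrage price for everyone except the single non-eliminated buyer."""
    reserves = [barrage_price(gamma_0)] * num_buyers
    reserves[active] = price
    return reserves


def sp_outcome(
    reserves: Sequence[float], bids: Sequence[float]
) -> Tuple[Optional[int], float]:
    """Second-price rule with personal reserves: (winner or None, payment)."""
    participants = [m for m in range(len(bids)) if bids[m] >= reserves[m]]
    if not participants:
        return None, 0.0
    winner = max(participants, key=lambda m: bids[m])
    rivals = [bids[m] for m in participants if m != winner]
    return winner, max([reserves[winner]] + rivals)


# Buyer 1 is active with price 0.4; rivals bid at most 1, below the barrage.
reserves = reserves_with_one_active_buyer(active=1, price=0.4, num_buyers=3, gamma_0=0.8)
accepted = sp_outcome(reserves, [1.0, 0.7, 1.0])  # buyer 1 wins and pays 0.4
rejected = sp_outcome(reserves, [1.0, 0.3, 1.0])  # nobody participates
```

Because no rival reaches the barrage price, the outcome depends only on whether the active buyer bids above or below his own reserve, exactly as in a posted-price auction.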
Learning proc.#2: an idea to find max valuation
The search algorithm works by maintaining a feasible interval $[u^m, w^m]$ that
► is aimed at localizing the valuation $v^m$, i.e., $v^m \in [u^m, w^m]$
► shrinks as $t \to \infty$
❑ If, in a round $t$, it turns out that $w^m < u^n$ for some buyers $m$ and $n$, then buyer $m$ has a non-maximal valuation, which should not be searched anymore
[Figure: the axis up to 1 with valuations $v^3$, $v^2$, $v^1$ and the feasible intervals $[u^1, w^1]$, $[u^2, w^2]$, $[u^3, w^3]$ at rounds $t_1$, $t_2$, $t_3$]
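The elimination test on feasible intervals can be sketched as follows. The function name `dominated_buyers` and the dictionary layout (buyer id mapped to a `(lower, upper)` pair) are assumptions for illustration. The check uses the fact that an upper end lies below some rival's lower end exactly when it lies below the largest lower end.

```python
from typing import Dict, Set, Tuple


def dominated_buyers(intervals: Dict[int, Tuple[float, float]]) -> Set[int]:
    """Buyers whose upper end lies strictly below another buyer's lower end."""
    best_lower = max(lower for lower, _ in intervals.values())
    # A buyer cannot dominate himself, since upper >= lower in each interval.
    return {m for m, (_, upper) in intervals.items() if upper < best_lower}


# Buyer 2 is certainly non-maximal: his interval ends below buyer 1's start.
example = dominated_buyers({1: (0.7, 0.8), 2: (0.3, 0.6), 3: (0.1, 0.75)})
```

Buyer 3 stays in the search here: his interval still overlaps buyer 1's, so his valuation may yet be the maximal one.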
Dividing algorithms
Key instrument that implements the ideas: the transformation div
Transformation div: cyclic elimination
Let $\mathcal{A}$ be an algorithm designed for repeated posted-price auctions
❑ Its transformation $\mathbf{div}(\mathcal{A})$ is an algorithm for repeated SP auctions as follows (only one reserve price in a round is non-barrage; "bar" marks a barrage price)
Rounds, t =    1      2      3      4      5      6      7      8
Periods, i =   1      1      1      2      2      2      3      3
Buyer 1:       p_1^1  bar    bar    p_4^1  bar    bar    p_7^1  bar     (reserve prices are set by Algorithm A)
Buyer 2:       bar    p_2^2  bar    bar    p_5^2  bar    bar    p_8^2   (reserve prices are set by Algorithm A)
Buyer 3:       bar    bar    p_3^3  bar    bar    p_6^3  bar    bar     (reserve prices are set by Algorithm A)
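The cyclic schedule in this diagram can be reproduced with a few lines of Python. This is a sketch with hypothetical helper names. It assumes rounds and buyers are numbered from 1, that all buyers are still being considered, and it writes "bar" for a barrage price.

```python
from typing import List


def active_buyer(t: int, num_buyers: int) -> int:
    """Buyer who receives the single non-barrage reserve price in round t."""
    return (t - 1) % num_buyers + 1


def period_of_round(t: int, num_buyers: int) -> int:
    """Period (one full cycle over the buyers) that round t belongs to."""
    return (t - 1) // num_buyers + 1


def reserve_row(buyer: int, horizon: int, num_buyers: int) -> List[str]:
    """Symbolic reserve prices of one buyer over rounds 1..horizon."""
    return [
        f"p_{t}^{buyer}" if active_buyer(t, num_buyers) == buyer else "bar"
        for t in range(1, horizon + 1)
    ]


# Rows of the diagram for three buyers and eight rounds.
rows = {m: reserve_row(m, 8, 3) for m in (1, 2, 3)}
```

Each buyer thus faces the posted-price algorithm once per period and a barrage price in all other rounds.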
Transformation div: stopping rule
We stop considering a buyer $m$ in periods once $w^m < u^n$ for some buyer $n$.
❑ The number of periods with buyer $m$ is referred to as the subhorizon, $I^m$.
Rounds, t =    s      s+1        s+2        s+3        s+4        s+5        s+6        s+7
Periods, i =   i      i          i          i+1        i+1        i+2        i+2        i+3
Buyer 1:       p_s^1  bar        bar        bar        bar        bar        bar        bar        (reserve prices are set by Algorithm A)
Buyer 2:       bar    p_{s+1}^2  bar        p_{s+3}^2  bar        p_{s+5}^2  bar        p_{s+7}^2  (reserve prices are set by Algorithm A)
Buyer 3:       bar    bar        p_{s+2}^3  bar        p_{s+4}^3  bar        p_{s+6}^3  bar        (reserve prices are set by Algorithm A)
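The stopping rule can be added to the cyclic schedule as in the sketch below. The argument `dropped_after` is a hypothetical stand-in for the interval test: it maps a buyer to the last period in which he is still considered. In the actual transformation this would be decided online from the feasible intervals, which are not modeled here.

```python
from collections import Counter
from typing import Dict, Iterator, Tuple


def div_rounds(
    num_buyers: int, horizon: int, dropped_after: Dict[int, int]
) -> Iterator[Tuple[int, int, int]]:
    """Yield (round, period, active buyer) for the schedule with stopping."""
    active = list(range(1, num_buyers + 1))
    t, period = 1, 1
    while t <= horizon and active:
        # Within a period, each remaining buyer gets one non-barrage round.
        for m in list(active):
            if t > horizon:
                break
            yield t, period, m
            t += 1
        # Drop the buyers whose last considered period has just ended.
        active = [m for m in active if dropped_after.get(m, float("inf")) > period]
        period += 1


# Buyer 1 is dropped after the first period, as in the diagram with s = 1.
schedule = list(div_rounds(3, 8, {1: 1}))
subhorizons = Counter(m for _, _, m in schedule)
```

With these inputs the active buyers over rounds 1 to 8 are 1, 2, 3, 2, 3, 2, 3, 2, and the counter gives the number of periods spent on each buyer.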
Transformation div: regret decomposition
Lemma 1. For the described transformation, the strategic regret has the decomposition:
$\mathrm{SReg}(T, \mathbf{div}(\mathcal{A}), \{v^m\}_m, \{\gamma_m\}_m) = \sum_{m=1}^{M} \mathrm{Reg}^m(T, \mathcal{A}, v^m, \gamma_m) + \sum_{m=1}^{M} I^m \big(\max_n v^n - v^m\big)$
► Individual regrets (first sum): measure how the algorithm $\mathcal{A}$ learns the valuation of each buyer
► Deviation regret (second sum): measures how fast we stop learning of non-maximal valuations
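The decomposition can be checked with a short worked derivation. It is a sketch under two assumptions that the slide does not state explicitly: each round has exactly one non-eliminated buyer, written $m_t$ here, and the individual regret of buyer $m$ is the sum of $v^m$ minus the collected payment over the rounds in which he is the non-eliminated one. The symbols $m_t$, $\mathcal{T}^m$ and $R_t$ are introduced only for this illustration.

```latex
% R_t: revenue collected in round t (zero if the active buyer rejects).
% \mathcal{T}^m: rounds in which buyer m is the non-eliminated one, |\mathcal{T}^m| = I^m.
\begin{align*}
\max_n v^n - R_t
  &= \bigl(v^{m_t} - R_t\bigr) + \bigl(\max_n v^n - v^{m_t}\bigr)
  && \text{(add and subtract } v^{m_t}\text{)} \\
\sum_{t=1}^{T} \bigl(\max_n v^n - R_t\bigr)
  &= \sum_{m=1}^{M} \sum_{t \in \mathcal{T}^m} \bigl(v^m - R_t\bigr)
   + \sum_{m=1}^{M} \sum_{t \in \mathcal{T}^m} \bigl(\max_n v^n - v^m\bigr)
  && \text{(group rounds by active buyer)} \\
  &= \sum_{m=1}^{M} \mathrm{Reg}^m
   + \sum_{m=1}^{M} I^m \bigl(\max_n v^n - v^m\bigr).
\end{align*}
```

The second sum vanishes for a buyer with the maximal valuation, so only the time spent on non-maximal buyers contributes to it.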
Key challenge against strategic buyer
A strategic buyer may lie and mislead algorithms; thus, a good algorithm must extract correct information about a buyer's valuation from his actions (bids)
❑ The dividing structure allows us to construct a tool to locate valuations: it is enough to create a complete-information situation in a round
Upper bound on valuation of strategic buyer
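Let buyer $m$ be the non-"eliminated" one in a round $t$.
❑ If the buyer accepts (bids above) the current reserve price $p_t^m$:
$\mathrm{Surplus}_t = \mathbb{E}\big[\gamma_m^{t-1}\, \mathbb{I}_{\{m = \overline{m}_t\}}\, (v^m - p_t^m)\big] + \mathbb{E}\Big[\sum_{s=t+1}^{T} \gamma_m^{s-1}\, \mathbb{I}_{\{m = \overline{m}_s\}}\, (v^m - \overline{p}_s^m)\Big]$,
where, due to complete information in round $t$, the first term equals $\gamma_m^{t-1}\, \mathbb{I}_{\{b_t^m \ge p_t^m\}}\, (v^m - p_t^m) = \gamma_m^{t-1} (v^m - p_t^m)$
❑ If the buyer rejects (bids below) the current reserve price $p_t^m$:
$\mathrm{Surplus}_t = \mathbb{E}\Big[\sum_{s=t+r}^{T} \gamma_m^{s-1}\, \mathbb{I}_{\{m = \overline{m}_s\}}\, (v^m - \overline{p}_s^m)\Big] \le \frac{\gamma_m^{t+r-1}}{1 - \gamma_m}\, (v^m - \text{[lowest\_price]})$
If we observe that a buyer rejects a non-"barrage" reserve price, then:
$v^m - p_t^m < \frac{\gamma_m^{r}}{1 - \gamma_m - \gamma_m^{r}}\, (p_t^m - \text{[lowest\_price]})$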
Optimal algorithm
Pricing algorithm divPRRFES
Apply the transformation div to the PRRFES algorithm
divPRRFES: individual and deviation regrets
❑ Individual regrets
Our tool to locate valuations provides the upper bound (as in the 1-buyer case): $\mathrm{Reg}^m(T, \mathcal{A}, v^m, \gamma_m) = O(\log_2 \log_2 T) \quad \forall m$
❑ Deviation regrets
► For each buyer $m$ with a non-maximal valuation (i.e., $v^m < \max_n v^n$)
► We can upper bound its subhorizon $I^m$:
$I^m \le \frac{C}{\max_n v^n - v^m}$
divPRRFES is optimal
Theorem. Let $\gamma_0 \in (0,1)$. Then for the pricing algorithm divPRRFES $\mathcal{A}$ with:
► the number of penalization rounds $r \ge \log_{\gamma_0} \frac{1 - \gamma_0}{2}$ and
► the exploitation rate $g(l) = 2^{2^l}$, $l \in \mathbb{Z}_+$,
for any valuations $v^1, \dots, v^M \in [0,1]$, any discounts $\gamma_1, \dots, \gamma_M \in (0, \gamma_0]$, and $T \ge 2$, the strategic regret is upper bounded:
$\mathrm{SReg}(T, \mathcal{A}, \{v^m\}_m, \{\gamma_m\}_m) \le C\,(\log_2 \log_2 T + 2) + B$, where $C := M\,(r \max_m v^m + 4)$ and $B := (24 + 5r)(M - 1)$.
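The quantities in the theorem are easy to evaluate numerically. The sketch below uses hypothetical function names and follows one reading of the slide's constants: the factor in front of the log-log term is taken as the number of buyers times (penalization rounds times the maximal valuation, plus 4), and the additive term as (24 + 5 times the penalization rounds) times (number of buyers minus 1). Treat the numbers as an illustration of that reading rather than as an authoritative statement of the bound.

```python
import math


def min_penalization_rounds(gamma_0: float) -> int:
    """Smallest integer r with r >= log_{gamma_0}((1 - gamma_0) / 2)."""
    return math.ceil(math.log((1.0 - gamma_0) / 2.0, gamma_0))


def exploitation_rate(l: int) -> int:
    """Exploitation rate g(l) = 2^(2^l)."""
    return 2 ** (2 ** l)


def regret_bound(horizon: int, num_buyers: int, r: int, v_max: float) -> float:
    """Upper bound on the strategic regret under the reading described above."""
    if horizon < 2:
        raise ValueError("the bound is stated for horizons of at least 2")
    mult = num_buyers * (r * v_max + 4)  # constant in front of the log-log term
    add = (24 + 5 * r) * (num_buyers - 1)  # additive constant
    return mult * (math.log2(math.log2(horizon)) + 2) + add


r = min_penalization_rounds(0.8)  # 0.8**11 <= 0.1 < 0.8**10, so r = 11
bound = regret_bound(horizon=2 ** 16, num_buyers=3, r=r, v_max=1.0)
```

For a horizon of $2^{16}$ rounds the log-log term is only 4, which illustrates how slowly the bound grows with the horizon.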
Summary
Main contribution: reminder
► A novel algorithm for setting reserve prices in second-price auctions with strategic buyers. Its worst-case regret is optimal: $\Theta(\log\log T)$ for $\gamma < 1$
► A novel transformation that maps any pricing algorithm designed for posted-price auctions to a multi-buyer setup
adrutsa@yandex.ru
Thank you!
Alexey Drutsa Yandex