
Does credit card repayment behavior depend on the presentation of interest payments? The cuckoo fallacy

Christina E. Bannier∗ Florian Gärtner† Darwin Semmler‡ January 31, 2019

Abstract

We study credit card repayment behavior in an experiment with 404 participants run on Amazon’s mTurk platform in 2018. We show that only a small fraction of subjects repays optimally. Instead, a large number of participants focuses on repaying the card that produces more new debts. As this is not necessarily also the card with the highest interest rate, we refer to the ensuing allocation error as the cuckoo fallacy. Employing different treatments that present relevant information in different ways, we show that participants can be nudged away from the cuckoo fallacy. Interestingly, while financially literate participants succumb less often to this irrational repayment strategy, the nudging effect of information presentation persists irrespective of financial literacy.

JEL Classification: D91, G11, D83, J26

Keywords: Household finance, credit cards, financial literacy, rationality, bias, cuckoo fallacy

∗Professor of Banking & Finance, Justus-Liebig-University Giessen, Licher Str. 62, 35394 Giessen, Germany, Phone: +49 641 99 22551, Fax: +49 641 99 22559, E-mail: Christina.Bannier@wirtschaft.uni-giessen.de

†Justus-Liebig-University Giessen, Licher Str. 58, 35394 Giessen, Germany, Phone: +49 641 99 22595, E-mail: Florian.Gärtner@wirtschaft.uni-giessen.de

‡Justus-Liebig-University Giessen, Licher Str. 58, 35394 Giessen, Germany, Phone: +49 641 99 22594, E-mail: Darwin.Semmler@wirtschaft.uni-giessen.de


1 Introduction

Over the last 20 years, several household finance puzzles, i.e. deviations from optimal behavior as deduced by rational choice, have been identified (Beshears et al. 2018, DellaVigna 2009, Zinman 2015): People hold low-interest assets and high-interest credit card debts at the same time (Gorbachev & Luengo-Prado forthcoming, Gross & Souleles 2002, Laibson et al. 2001), choose more expensive credit cards when they could also choose a cheaper one (Agarwal et al. 2015), fail to refinance their mortgages efficiently (Keys et al. 2016), and are influenced by anchoring (Keys & Wang 2016, Stewart 2009) and simple reminders that could be interpreted as priming (Stango & Zinman 2014).

Among more recent studies, the credit card debt puzzle has received particular attention. It posits that, when endowed with several credit cards, a significant fraction of people do not repay their debts in an interest-minimizing way (Gathergood et al. 2018, Ponce et al. 2017). Instead, simpler heuristics like balance matching (Gathergood et al. 2018) or mental accounting (Ponce et al. 2017, Thaler 1985) seem to explain the observed behavior better than rational optimization.

Inspired by these results, we reexamine the strategies that people use to repay their debts. To do so, we develop a simple experimental setting where participants hold credit card accounts with negative balances and are provided with an income stream over several rounds, which they can use to repay these debts. We run a pilot study on Amazon mTurk and implement several treatments to elicit different repayment behaviors. Our pilot study shows one strategy to be particularly pervasive that appears to complement the balance matching and mental accounting strategies derived in earlier work. Based on this pre-finding, we hypothesize that participants focus too strongly on repaying the credit card with the highest current level of interest payment and too little on repaying the card that allows for the strongest reduction of interest payment. The level of interest payment per card hence appears to be the more urgent or more easily processed issue in comparison to a change in the interest payment, thus triggering a repayment decision that is irrational if the card with the highest level of interest payment is not also the card with the highest interest rate. We call this the cuckoo fallacy as it mirrors the behavior of parenting birds feeding first of all the largest and most urgently pleading fledgling in their nest, which might turn out to be a cuckoo.

We test our hypothesis with data derived from different experiments run on mTurk with subjects from the US. In the experiments, participants are endowed with two credit card accounts that show the same initial debt levels but feature different interest rates: There is a low-interest and a high-interest credit card. Participants receive a constant income stream in every round that they can use to repay the two credit cards in a fully flexible way. In the basic treatment, participants are provided with the necessary information in a very neutral way. That is, they learn the debt levels and income level per period and the interest rates on each credit card. Two alternative treatments are then designed so as to draw participants’ attention either to the current level of interest payment per card or to the reduction in overall interest payment due to the repayment. We measure the potential amount of misallocation per participant as the deviation from the optimal repayment behavior and study whether it is affected by the two treatments.

Interestingly, we find that only 18.32% of all participants in the basic treatment choose the optimal strategy and repay the credit card with the highest interest rate in all experimental rounds. This proportion is much lower (11.11%) in the treatment that focuses attention on the level of interest payments per card, but much higher (26.81%) in the treatment that emphasizes the overall reduction in interest payments. When we test for the treatments’ influence on the allocation error in a multivariate analysis, we find a significant effect only for the treatment that emphasizes the total reduction in interest payments: Here, the allocation error is significantly smaller as compared to the basic treatment. The allocation error does not increase as compared to the basic treatment, however, when the levels of interest payments per credit card are emphasized. We conclude from this observation that participants may be nudged towards more rational behavior by presenting relevant information in an appropriate way, but will succumb to the cuckoo fallacy in a similar fashion in any other case.

Our multivariate results are robust against controlling for person-specific characteristics such as gender and age, but also characteristics more specific with regard to the decision at hand, such as financial literacy or experience with credit cards. Even though we find that financial literacy correlates negatively with the allocation error, it might be surprising that the nudging effect of the treatment is not affected by the financial literacy of participants. That is, participants show a significantly lower allocation error in the treatment that emphasizes the overall reduction in interest payments, irrespective of whether their financial literacy is high or low.

It should be noted that the cuckoo fallacy becomes relevant only in cases where the credit card with the low interest rate shows a sufficiently higher debt level than the credit card with the high rate. This is because the interest payment will then be higher on the low-interest rate card than on the high-interest rate card, so that repaying the former (due to its higher level of interest payment) turns out to be irrational. As our experiments start with an equal debt level on both credit cards in the first round, this situation arises only in later rounds. When we account for this effect and focus only on these later rounds where the irrationality may truly arise, we still find that the treatment that focuses on the overall reductions in interest payments is associated with a lower allocation error. Our results are hence not driven by the fact that this particular treatment may reduce the incidence of the cuckoo fallacy in the first place. Further robustness tests consider an alternative non-parametric test, different measurements of the allocation error and account for inattention effects of participants. They all support our earlier findings.

We conclude from our results that potential allocation errors in credit card repayment strategies do not only stem from balance matching or mental accounting as has been shown by earlier work. Rather, the phenomenon of focusing too strongly on the level of interest payments per credit card and disregarding the reduction in interest payments, which we refer to as the cuckoo fallacy, also appears to robustly contribute to misallocations. We show that the allocation error caused by the cuckoo fallacy may be influenced by the way that information on these accounts is provided. More precisely, our findings indicate that by actively shifting debtors’ attention to the reduction in interest payments that follows from a specific repayment decision, costly inefficiencies in repayment behavior can be reduced. Interestingly, this nudging policy should work independently of a person’s financial literacy. That is, even though financially literate participants appear to succumb much less strongly to the cuckoo fallacy in the first place, nudging does not seem to affect them any differently than participants with lower financial education.

In conclusion, our study has two goals. We first establish our new experimental paradigm and investigate whether our participants repay their debt optimally. Second, we investigate the cuckoo effect and try to steer its occurrence by manipulating the salience of certain information. We develop a basic treatment with neutral information, a treatment designed to support the cuckoo fallacy and a third treatment designed to suppress it. The details are fleshed out in Section 4.

The remainder of the paper is structured as follows. Section 2 describes our hypothesis and presents the underlying theory. Section 3 explains our process of collecting data, section 4 delineates our methodology and section 5 shows the results. Section 6 concludes.

2 Theoretical background

One major advantage of studying repayment of debts in the credit card context is that it is possible to develop a simple benchmark for optimal behavior with fairly uncontroversial assumptions. Interpreting an observed behavior as a deviation from rationality is usually tricky since it is possible to rationalize basically every behavior, as long as the assumptions about the underlying preferences are convenient enough. For example, playing the lottery or visiting the casino, which usually implies playing gambles with negative expected values for the customer, can be rationalized by assuming risk seeking preferences. Gifting and donating money can be rationalized by assuming social preferences, and so on. Therefore, while rational choice under conventional assumptions predicts that debtors minimize their total debts, implying that they should always repay the cards with the highest interest rates first, it is still possible to interpret deviations from that hypothesis as rational, given certain preferences. However, to do so, one needs to either assume that debtors want to gift money to the lenders for some reason, or that they do not care about their wealth at all (or even want to maximize their debts), which seem to be pretty heroic assumptions, too heroic for us to believe. Thus, in this setting one can omit discussions about preferences and observe rational and not-so-rational behavior directly, as it is reasonable to define rational repayment as interest-minimizing. This also results in a simple repayment rule for rational debtors: Start repaying the most expensive card until its balance reaches zero, then continue with the second most expensive card and so on.

However, two recent studies show that a large fraction of Mexican (Ponce et al. 2017) and British (Gathergood et al. 2018) cardholders seem to repay their debt in a non-optimal way. Ponce et al. (2017) find that “The average consumer misallocates 50 percent of her monthly payments above the minimum to pay down low-interest debt” and explain that with limited attention and mental accounting. Gathergood et al. (2018) find similar results for Great Britain. They test several heuristics to explain observed credit card repayments in Great Britain, such as the 1/N heuristic (Benartzi & Thaler 2001), where debtors repay all credit cards with open debts equally, or strategies that focus on the balance or the capacity of cards as the respective decision criterion. While 1/N is pretty successful, the heuristic which seems to explain their data best is what they call “balance matching”, where debtors repay each card based on the fraction of the total debt it represents. For example, if the debt on one card amounts to 600$ and on the other 400$, and the debtor wants to repay 100$ in a given period using that heuristic, balance matching implies a repayment of 60$ on the first and 40$ on the second card. Their results show that this heuristic outperforms all the other strategies they use to explain actual repayment behavior in field data from Great Britain, including interest-minimization.

While they cannot provide a theoretical explanation for the strategy’s explanatory power, they link its usage to the concept of probability matching (e.g. Vulkan 2000). This term describes a behavior observed in situations of uncertainty, where two choices yield the same value but with different probabilities. While it would be rational to always choose the option with the higher probability, people (as well as pigeons (Herrnstein 1961)) often try to match their choice distribution with the probability distribution. For example, if one choice offers 1$ with a probability of 70% and 0$ with 30%, and the second choice 1$ with a probability of 30% and 0$ with 70%, probability matching would imply that the first alternative would be taken in 70% of all choices and the second in 30%. Gathergood et al. (2018) speculate that debtors might use a similar mechanism with respect to the balance, as the balance is usually highlighted prominently on credit card statements.

Originally, we tried to replicate the results of Gathergood et al. (2018) and Ponce et al. (2017) in an experimental setting to investigate the actual strategies that participants would use, hoping to advance the theoretical understanding. We developed an experimental paradigm to study credit card debt repayments, which we present later in this paper, and tested it in a pilot study on mTurk in summer 2018. We skip the details of this pilot study, but one particular finding stood out: A substantial fraction of subjects start repaying optimally, but then deviate, as if worrying about the fact that the cheaper card starts to produce more new debts than the more expensive one.

Consider a stylized example: You have a debt of 4000$ on a credit card account with a 3% interest rate per period, and 500$ on a second account with a rate of 5%. So in the next period, the 4000$ card will produce 120$ of new debts, while the 500$ card will produce only 25$. You have an income of 200$ per period that you can use to lower your debts. Should you use this money to lower the comparably large amount of 120$ of new debt and ignore the small amount of only 25$ of new debt on the other card? After all, in the next period these 120$ will be charged with 3% as well, producing 3.60$ of compound interest, while 5% of 25$ will only lead to 1.25$ of compound interest. So if you ignore the cheaper card, its debt seems to “explode”. Should you try to suppress this explosion, and thus repay the card that produces more new debts first? Rationally the answer is no: you should still repay the relatively expensive 500$ card first and not the cheaper 4000$ card. For many of our participants, however, the answer seems to be yes. They repay the card that produces more new debts, rather than trying to reduce the overall new debts. This is optimal as long as the more expensive card produces more new debts per period, but is not when the cheaper card produces more new debts¹. This finding inspired the present paper.

¹If you do think that you should try to focus on the card that produces more new debts, consider the two extreme cases of our example, each for two successive rounds: First, repay the full 200$ on the 4000$ account in both periods. This leads to a total debt of 3800$ · 1.03 + 500$ · 1.05 = 4439$ after the first round and to a total debt of 3714$ · 1.03 + 525$ · 1.05 = 4376.67$ after the second round. Now, instead consider repaying 200$ on the 500$ account in both rounds. This leads to a total debt of 4000$ · 1.03 + 300$ · 1.05 = 4435$ after the first round and to a total debt of 4120$ · 1.03 + 115$ · 1.05 = 4364.35$ after the second round. Repaying the more expensive card that produces lower new debts saves 4$ after the first and 12.32$ after the second round.

We call this the “cuckoo fallacy” as it describes a situation that is similar to parenting birds that tend to the largest and most urgently cheeping fledgling in their nest before feeding the rest. Unfortunately, this fledgling may turn out to be a cuckoo chick. This fallacy seems to be related to matching behavior as well as mental accounting. We assume it is a consequence of limited cognitive capacity. According to mental accounting (Thaler 1985), agents organize their total expenses, which usually include dozens or hundreds of single expenditures per month, into simpler, more easily manageable mental subaccounts. Then they provide budgets to these accounts, depending on their preferences. Analogously, we propose that when managing debt, agents use their actual account structure to split their total debt into separate mental accounts, which are likewise easier to manage than an aggregated total, because the aggregation of debts is challenging. In particular, it is easier to manage and forecast changes in the total debt caused by interest if the debt is divided into several accounts.

This division alone, however, does not explain which information agents use when they try to solve the problem of repayment. In our context, we assume that they use the most salient difference between the accounts. This is also the way Gathergood et al. (2018) explain the usage of balance matching: The balances are highlighted on credit card statements, so these more salient differences in balances are driving the behavior and not the less salient information on interest rates. We propose that a third piece of information that agents use is the difference in the amount of new debt each card produces. This difference becomes more salient the larger it is. Once it is more salient than the differences in interest rates and balances, it provides agents with a simple heuristic: Repay the card that produces the most new debt, until the difference in new debts is small enough that it becomes less salient than other differences again. The cuckoo fallacy is a consequence of this heuristic, which occurs whenever it is the cheaper account that produces more new debts. In the following, we conduct an experiment testing this theory with different treatments, which are intended to either increase or decrease the probability of falling for the cuckoo fallacy.
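The stylized example from this section (4000$ at 3%, 500$ at 5%, 200$ of income per period) can be replayed numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes that the full income goes to a single card each period and that interest is charged after the repayment, as in footnote 1.

```python
def total_debt_path(balances, rates, income, rounds, pick_card):
    """Return total debt after each round when all income goes to one card.

    pick_card(balances, rates) returns the index of the card to repay.
    Assumption (ours): repay first, then charge interest on what remains.
    """
    balances = list(balances)
    totals = []
    for _ in range(rounds):
        target = pick_card(balances, rates)
        balances[target] -= income
        balances = [b * (1 + r) for b, r in zip(balances, rates)]
        totals.append(round(sum(balances), 2))
    return totals


def highest_rate(balances, rates):
    """Interest-minimizing rule: repay the card with the highest rate."""
    return max(range(len(rates)), key=lambda i: rates[i])


def most_new_debt(balances, rates):
    """Cuckoo heuristic: repay the card that produces the most new debt."""
    return max(range(len(rates)), key=lambda i: balances[i] * rates[i])


start_balances = [4000.0, 500.0]
card_rates = [0.03, 0.05]

cuckoo_totals = total_debt_path(start_balances, card_rates, 200.0, 2, most_new_debt)
optimal_totals = total_debt_path(start_balances, card_rates, 200.0, 2, highest_rate)

for rnd, (c, o) in enumerate(zip(cuckoo_totals, optimal_totals), start=1):
    print(f"round {rnd}: cuckoo {c:.2f}$, optimal {o:.2f}$, saved {c - o:.2f}$")
```

Under these assumptions the cuckoo rule ends at 4439$ and 4376.67$ of total debt after the first and second round, while the interest-minimizing rule ends at 4435$ and 4364.35$, which reproduces the savings of 4$ and 12.32$ stated in footnote 1.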

3 Data

3.1 Gathering Data on mTurk

We ran our experiments on Amazon’s crowdsourcing platform Mechanical Turk (mTurk). Crowdsourcing platforms are a relatively new way to conduct experiments, but are becoming more and more common as a convenience sample in the social sciences. Like any convenience sample, their usage is controversial. Sceptics raise concerns about a lack of attention by the “Turkers”, control problems, too experienced subjects and low external validity (e.g. Chandler et al. 2014, 2015, Ford 2017). Since most of those concerns are relatively easy to study, Turkers are one of the most extensively and thoroughly examined sample populations. Recent papers that discuss potential issues include Chandler & Shapiro (2016), Goodman & Paolacci (2017), Hauser et al. (2018) and Miller et al. (2017).

The findings in general seem to imply that the data quality of Turkers is somewhat worse than that of actual representative samples, but outperforms that of common convenience samples such as undergraduates. Turkers from the US seem to resemble the general US population relatively well (Huff & Tingley 2015, McCredie & Morey 2018, Paolacci et al. 2010, Snowberg & Yariv 2018), and especially better than student samples (Snowberg & Yariv 2018, Roulin 2015; however, see Krupnikov & Levine 2014). They produce data of relatively high quality (Kees et al. 2017, Snowberg & Yariv 2018) and seem to be more attentive than students (Hauser & Schwarz 2016, Ramsey et al. 2016). Replications of classical studies of psychology, political sciences and economics are usually successful (e.g. Amir et al. 2012, Berinsky et al. 2012, Coppock 2018, Crump et al. 2013, Horton et al. 2011, Mullinix et al. 2015, Wolfson & Bartkus 2013), though not every result is replicable, which is not too surprising given recent replication problems in social sciences and economics (e.g. Camerer et al. 2018, Open Science Collaboration 2015). However, data quality seems to fall once non-native English speakers are included (e.g. Goodman et al. 2013). Since we restricted our sample to the US population and implemented additional checks and methods to ensure high data quality, we believe that our data is of high quality and performs at least as well as any data we could acquire by using common lab samples.

3.2 Ensuring Data quality

The work of Turkers on any given HIT can be approved or rejected by the requester of that HIT. A HIT is a given set of tasks that the Turker has to work on; for example, our experiment from start to finish is one single HIT. We restrict participation to Turkers with at least 100 completed HITs to screen out throwaway accounts, bots and new Turkers, whom we expect to make more mistakes simply because they are unfamiliar with mTurk as a whole. We set an approval rate on former HITs of at least 95%, a common threshold that was shown to ensure high data quality (Peer et al. 2014). To set up another hurdle for simple bots and to make sure that our subjects have a basic level of numeracy, we asked participants to calculate 1% of 1000 in an open question. To exclude more advanced bots, we asked our participants to describe the strategies they used in the experiment in an open question after they finished the experimental stage. We analyzed qualitatively whether the answers fit the question, which is the case for all our subjects. Thus, we are very confident that our data does not include any bots.

To ensure attention, we included two attention check questions. In the first question, right after the numeracy question, participants had to agree or disagree with the statement “All my friends are from outer space”. If someone agreed, we screened her out. The second question was implemented in the financial literacy questionnaire. We gave subjects the choice between two options that we labeled “First answer” and “Second answer” and asked them to select “Second answer”. We screened out everyone who selected “First answer”.

4 Research Design

4.1 The basic experimental paradigm

We developed a simple but highly flexible and adaptable experimental framework, which we lay out in the following: Subjects were endowed with a number of credit card accounts and a checking account. We framed these accounts as such to increase clarity. The experiment started with negative opening balances on the credit card accounts and a certain amount of wealth on the checking account. The credit card accounts charged interest (usually at different interest rates), while the checking account paid no interest. The game was played for several rounds. In each round, starting with round 1, an income was paid into the checking account, which subjects could freely distribute between the credit card accounts or hold on the checking account. Subjects decided themselves when a round was finished. If they held any money on the checking account at the end of a round, it was transferred to the next round. Between the rounds, the interest on the credit card accounts was calculated and added to the balance in the next round. Then another round started, until the game was over. We also charged interest after the last round, because otherwise participants would have no incentive to repay anything in the last round and the data of that round would, at least in theory, be of no use.

Subjects were paid a show-up fee plus any bonus that may have resulted from the actual experiment. To calculate the bonus we measured the repayment efficiency. We calculated the maximum possible and the minimum possible debt after the end of the experiment and defined the repayment efficiency² as where the actual debt lies between the maximum and the minimum debt, expressed as a percentage.

²Let min be the minimal possible amount of debt at the end of the experiment, i.e. the result of repaying optimally (-2988.51$), and let max be the maximal possible amount of debt, i.e. the result of doing nothing (-3790.20$). Furthermore, let debt be the actual amount of debt a participant has at the end of the experiment. We then define the repayment efficiency by

\[ \mathit{eff} = 1 - \frac{\mathit{debt} - \mathit{min}}{\mathit{max} - \mathit{min}}. \]
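To make the definition in footnote 2 concrete, the following is a minimal sketch of the repayment efficiency as a function. The function and constant names are hypothetical; the two bounds are the values reported in the footnote, written as negative balances as in the text.

```python
# Bounds as reported in footnote 2 (negative numbers denote debt).
MIN_DEBT = -2988.51  # result of repaying optimally
MAX_DEBT = -3790.20  # result of doing nothing


def repayment_efficiency(debt, min_debt=MIN_DEBT, max_debt=MAX_DEBT):
    """eff = 1 - (debt - min) / (max - min), as defined in footnote 2."""
    return 1 - (debt - min_debt) / (max_debt - min_debt)


# Hypothetical final balance, chosen only for illustration.
example_debt = -3100.00
print(f"efficiency at {example_debt}$: {repayment_efficiency(example_debt):.3f}")
```

By construction, a participant who ends at the optimal result has an efficiency of 1 and a participant who never repays has an efficiency of 0; every other outcome lies in between.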


4.2 Shared features of all treatments

Screenshots of all screens seen during the experiment are provided in the appendix. The following is a description of the implementation:

We informed subjects about their compensation scheme and about the main task. Then two comprehension questions and the already mentioned numeracy/anti-bot question were asked, followed by the first attention question. After that, every subject had the chance to try out the experiment mechanics in two trial rounds. We assumed that two rounds are enough to become familiar with the charging of interest and the refilling of the checking account, but not enough to trigger learning effects before the main experiment. For the trial rounds, we used different values than for the main stage, because we did not want the subjects to apply their behavior from the trial rounds to the main experiment³.

After the practice rounds the main experiment started. Subjects had to repay debts on two accounts for ten rounds. We set the starting debts on each credit card to 2200$ and the starting endowment on the checking account to 250$. One of the credit cards had an interest rate of 3% per round and the other one of 5% per round. In every round the checking account was refilled with 250$. To rule out any effects of order, we assigned the interest rates to the two credit cards randomly for each participant.

We chose these particular values for account levels and interest rates because they fulfill several conditions:

  • Both credit cards start with the same amount of debts. This ensures that the only difference between the cards early on are the interest rates, maximizing the chance for optimal repayment in the early stages of the experiment.

  • It is not possible to repay one of the cards completely in ten rounds. For our research questions we are only interested in situations where subjects actually have to make a choice between two cards. Therefore ruling out this possibility ensures that we can evaluate every round of each subject.

  • The total new debt on both cards in each round does not exceed the deposit in the checking account, such that subjects do not get the impression of pointless repayments⁴.

  • To let the cuckoo fallacy arise, it is necessary that the cheap card produces more new debts than the expensive one at some point. In our experiment, this can happen for the first time in the sixth round, i.e. in the second half of the experiment.

³In the trials there were 1000$ debts on each credit card and 300$ starting balance on the checking account. Interest rates were the same as in the main rounds.

⁴However, if a subject does not repay anything at all, then her total new debts do exceed the checking account deposits in round 9. But if subjects do not repay at all, their “repayment” behavior could not be affected by any feelings of fatalism anyway.

Each round of a treatment started with the exact same interface. Subjects were shown a repayment screen, which was structured as follows: At the top there was a short introduction text to inform the participants what they were supposed to do and which round they were in. Below there was an information box with the relevant information about current account balances and interest. This information box was the only thing that varied between the different manipulations of the experiment. Below that information box we placed the transfer box. Subjects could transfer money from their checking account to their credit cards by typing in the amount of money to transfer and choosing the credit card the money should be transferred to.

With a “transfer”-button the subjects can confirm their payment. It is possible to specify the transferred amount to the cent. However, negative inputs or inputs that would lead to an overdraw of the checking account or an over-repayment on the credit cards produce an error. The participants can process as many transfers as they want within a round. They are also able to undo their actions and reset all the payments in the current round with a “reset payments”-button. To the left of the reset button, there is the “next round”-button, with which the subjects confirm their payments of the current round and start the next round. After confirming the current round, it is not possible to return to a former round. Although the transfer button and the next round button are placed far apart, we additionally make sure that subjects do not confuse the buttons and accidentally proceed to the next round by showing a warning if someone clicks on “next round” while a number is typed into the transfer box. At the very bottom, we place a copy of the experiment rules from the beginning of the experiment as a reminder in case a participant wants to reread the rules.

After subjects finish a round, an interlude screen is shown that tells subjects that the interest is calculated and that their checking account is refilled with 250$. Then the repayment screen of the next round is shown, and so on.

After the experimental stage is finished, we conduct a short questionnaire. First we ask participants for their repayment strategy in an open question. As already mentioned, we use this question to screen out advanced bots. To measure familiarity with credit cards, we then ask how many cards subjects own, and how many additional cards they have access to, via spouse, parents, and so on. We also ask if subjects have access to credit cards via work, and if they usually do not use credit cards, but at least know how they work. We use all these questions as categorical variables in our analysis.


We measure financial literacy using the “Big Three” financial literacy questions (Lusardi & Mitchell 2011) and add three further questions focusing specifically on debt literacy, created by Lusardi & Tufano (2015). We use these six questions to create an index of financial literacy that counts the number of correctly answered questions and thus ranges from 0 to 6. Our last questions concern demographics: we ask for gender, age, and years of formal education (the latter via item F16 of the European Social Survey (2016)).

Subjects are paid a participation fee of $1 plus a bonus of up to $2 multiplied by their repayment efficiency; that is, they receive a higher bonus the less debt they have at the end of the experiment. Thus, subjects earn between $1 and $3, as they cannot fall below the participation fee regardless of their performance in the experiment. On average, our participants earn $2.80 in roughly 20 minutes, which implies an hourly wage of around $8.40. Our payments are a bit lower than average payments in lab experiments, but seem to be higher than the median payment on mTurk. Hara et al. (2017) estimate the median wage on mTurk to be lower than $2 and the mean wage slightly above $3. Berg (2016) estimates an average hourly wage of around $5.50. In the pilot study, we also ask our subjects to estimate their average hourly wage on mTurk. Their estimates imply a median of around $7 and a mean of roughly $8 per hour. Given that we probably overpay our subjects, we are confident that the material incentives work.
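The payment rule and the implied hourly wage can be sketched in a few lines of Python. This is only an illustration: the text does not define the repayment efficiency here, so the sketch assumes it is a share between 0 and 1, and the function names `payment` and `hourly_wage` are hypothetical.

```python
def payment(efficiency: float, fee: float = 1.0, max_bonus: float = 2.0) -> float:
    """Total payment in dollars: fixed fee plus bonus scaled by repayment efficiency.

    Assumption: `efficiency` is a share in [0, 1]; values outside are clamped.
    """
    clamped = min(max(efficiency, 0.0), 1.0)
    return fee + max_bonus * clamped


def hourly_wage(earnings: float, minutes: float) -> float:
    """Scale earnings for a session of `minutes` minutes up to one hour."""
    return earnings * 60.0 / minutes


if __name__ == "__main__":
    print(payment(0.0), payment(1.0))   # lower and upper bound of the payment
    print(hourly_wage(2.80, 20))        # average earnings over roughly 20 minutes
```

Under this reading, the reported average payment of $2.80 would correspond to an average efficiency of about 0.9.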

4.3 Manipulations

In all treatments, the values, the repayment options and the optimal strategy are identical, so that behavior can be compared between treatments. We only manipulate the presentation of information. Furthermore, the treatments differ only in the trial and main experiment rounds. Explanations, comprehension tasks and the post-experimental questionnaire are the same in all treatments (see appendix). In particular, we only manipulate the presentation of the information box. We use the following three treatments for the experiment:

  • In the basic treatment, we show all the information as plainly as possible. The subjects see a table with the current balance of each account and the interest rates of the credit cards. The account balances are updated automatically after each transfer, so that subjects always see the current balances without having to calculate them themselves. The column for the interest rates shows the same information all the time, because the interest rates stay constant during the whole experiment.

  • In the “ShowNewDebts”-treatment we want the participants to focus on the new debts and to neglect the interest rates. According to our theory, this decreases the participants’ payments to the high-interest card in later rounds due to the cuckoo fallacy. As in the info box of the basic treatment, we keep the table with the three accounts and the column with the current account balances. However, we replace the column with the interest rates by a column that shows the debts that will be newly added in the next round given the current balances. The cell for the checking account is empty, while the new debts for the other accounts are shown in red. Participants can still check the interest rates at any time by clicking on a button below the info box. Since the interest rates are never shown directly during the rounds, we display them once immediately before the experiment stage.

  • In the “ShowSavedMoney”-treatment we want the participants to focus on the interest rates and to neglect the individual balances on the credit cards. According to our theory, this increases the repayments on the high-interest card by reducing the cuckoo fallacy. Instead of a table with a line for each account, we show different information line by line. Subjects see their balance on the checking account in black, then the total amount of debt on both credit cards in red, and finally, in green, how much their interest for the next round decreases with the current repayment compared to no repayment. Additionally, participants are shown for each credit card how much interest they save for each dollar they repay (which is basically an alternative, but more intuitive, presentation of the interest rates). Subjects can still see the individual credit card balances at any time by clicking on a button below the info box. Since the individual credit card balances and interest rates are never shown directly during the rounds, we display them once immediately before the experiment stage.

We design the manipulations to steer the attention of the participants to certain information. Thus, we use colors to reinforce the shifts of attention between the manipulations. The exact individual contributions of color and placement of information to the differences in attention have to be investigated in future studies.
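The quantities shown in the two manipulated info boxes can be illustrated with a short Python sketch. It assumes that interest is charged once per round as balance times rate; the rates in `RATES` and the function names are hypothetical and only serve the illustration, since the experiment's actual parameter values are not restated in this section.

```python
# Hypothetical per-round interest rates (assumption, not the experiment's values).
RATES = {"high": 0.05, "low": 0.03}


def new_debts(balances, rates=RATES):
    """ShowNewDebts column: debt added to each card in the next round."""
    return {card: balances[card] * rates[card] for card in balances}


def interest_saved(repayments, rates=RATES):
    """ShowSavedMoney line: reduction of next round's interest compared to
    no repayment, given the amounts currently repaid on each card."""
    return sum(amount * rates[card] for card, amount in repayments.items())


if __name__ == "__main__":
    print(new_debts({"high": 1000.0, "low": 2000.0}))
    print(interest_saved({"high": 100.0, "low": 150.0}))
```

In this sketch, the interest saved per repaid dollar is simply the card's rate, which is why the ShowSavedMoney presentation carries the same information as the interest rate column of the basic treatment.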


4.4 Operationalization of the repayment behavior

For each round of each participant, we have data on the balances of the checking account and the credit cards after the repayment. From this, we can also calculate the account balances before the rounds and the payments made by the participants in each round. To measure the allocation error, we compare the actual payments with the payments one should make to minimize debt (allocate everything from the checking account to the credit card with the highest interest rate). Let $p_{ij}$ be the three-dimensional vector of payments made by participant $i$ in round $j$ and $o_{ij}$ the optimal payment vector for participant $i$ in round $j$. Then we define the allocation error $e_{ij}$ as

$$e_{ij} = \frac{\lVert p_{ij} - o_{ij} \rVert_2}{2 M_{ij}} = \frac{1}{2 M_{ij}} \sqrt{\sum_{k=1}^{3} \left(p_{ijk} - o_{ijk}\right)^2},$$

that is, the root mean square error of $p_{ij} - o_{ij}$ divided by twice the maximum allocable amount of money $M_{ij}$ for participant $i$ in round $j$. It is necessary to use twice the value of $M_{ij}$, because each transfer shows up as a plus on the credit card as well as a minus on the checking account. Note that the allocation error $e_{ij}$ is mathematically equivalent to $1 - h_{ij}$, with $h_{ij}$ being the share of the allocable money that was allocated to the high-interest credit card.

For our analysis, we use the value $e_i$, defined as the mean allocation error of participant $i$ over all ten rounds. Using the mean ensures that every participant generates exactly one data point that is independent of the others. We do not find endogeneity in the resulting data.

The difference between the actually allocated money and the optimally allocated money is the basis for every error calculation in this study. However, other measures of the allocation error are possible. Gathergood et al. (2018) use the mean absolute error as an alternative measure. Since all our results stay the same if we switch to that measure (see robustness checks), we choose the root mean square error, not least because it can be interpreted in the simple way mentioned above.
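A minimal Python sketch of the error measure follows. It rests on assumptions that are not spelled out in the text: the payment vector is ordered as (checking account, high-interest card, low-interest card), money leaving the checking account is recorded with a negative sign, and the optimal payment moves the full allocable amount `m` onto the high-interest card. All function names are hypothetical. The sketch also computes the 1-norm variant used in the robustness checks and the share-based value 1 − h, so the three can be compared for a given split.

```python
import math


def optimal_payment(m):
    """Assumed optimal vector: everything goes to the high-interest card."""
    return (-m, m, 0.0)


def allocation_error_l2(p, m):
    """Euclidean distance to the optimal payment, divided by 2 * m."""
    o = optimal_payment(m)
    return math.sqrt(sum((pk - ok) ** 2 for pk, ok in zip(p, o))) / (2 * m)


def allocation_error_l1(p, m):
    """1-norm distance to the optimal payment, divided by 2 * m."""
    o = optimal_payment(m)
    return sum(abs(pk - ok) for pk, ok in zip(p, o)) / (2 * m)


def one_minus_share(p, m):
    """1 - h, where h is the share of m paid onto the high-interest card."""
    return 1.0 - p[1] / m


def participant_error(round_errors):
    """Mean allocation error of one participant over her rounds (e_i)."""
    return sum(round_errors) / len(round_errors)


if __name__ == "__main__":
    m = 250.0
    p = (-250.0, 150.0, 100.0)  # 150 to the high-interest card, 100 to the other
    print(allocation_error_l2(p, m), allocation_error_l1(p, m), one_minus_share(p, m))
```

Under the sign convention assumed here, the 1-norm variant coincides with 1 − h for any feasible split, while the Euclidean variant gives a smaller number for the same split. Which convention reproduces the reported values should be checked against the original data definition.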


5 Results

5.1 Participants

We used the variance and effect size in the pilot study to estimate the number of participants needed in the main study to verify possible differences between the treatments with a probability of at least 80%. A power test suggested 132 subjects in each treatment. We conduct the main experiment with a total of 414 participants from the United States on mTurk. We prevent multiple participation by using mTurk’s option to block Turkers that took part in earlier sessions and by checking the worker ID of every subject. Ten subjects failed the attention tasks and were screened out, leaving us with a total of 404 participants.

Control variables are financial literacy, age, gender, the number of owned credit cards, the number of additionally accessible credit cards (for example at work or via a family member) and the number of years of education. The data contain 190 females, 213 males and 1 participant of a third gender. The subjects’ age ranges from 19 to 75 years, with a mean of 37.1 years and a median of 35 years. Of the six questions on financial literacy, 3.73 are answered correctly on average. The participants own 2.65 credit cards and have access to an additional 0.66 credit cards on average. The mean number of years of education is 15.28. There are 131 subjects in the basic treatment, 135 in the ShowNewDebts-treatment and 138 in the ShowSavedMoney-treatment.

Table 1: Summary statistics

| Statistic | N | Mean | Median | St. Dev. | Min | Max |
|---|---|---|---|---|---|---|
| Financial Literacy | 404 | 3.730 | 4 | 1.382 | | 6 |
| Age | 404 | 37.097 | 35 | 10.693 | 19 | 75 |
| # Credit cards | 386 | 2.653 | 2 | 2.684 | | 20 |
| # Additionally accessible credit cards | 382 | 0.665 | | 1.250 | | 10 |
| # Years of education | 404 | 15.277 | 16 | 2.316 | 9 | 21 |

| Treatments | N | Female | Male | Third gender |
|---|---|---|---|---|
| Basic treatment | 131 | 58 | 72 | 1 |
| ShowNewDebts-treatment | 135 | 62 | 73 | 0 |
| ShowSavedMoney-treatment | 138 | 70 | 68 | 0 |
| Sum | 404 | 190 | 213 | 1 |


5.2 Proportion of optimal-behaving subjects

In every treatment we investigate the proportion of participants with optimal behavior. A participant is considered optimally behaving if she commits an allocation error of zero, that is, she pays everything onto the high-interest credit card in every round. With this definition, 76 of the 404 participants (18.81%) behave optimally. Comparing the treatments, 18.32% of the subjects in the basic group repay optimally, 11.11% in the ShowNewDebts group and 26.81% in the ShowSavedMoney group (Fig. 1). The differences between the basic group and each of the treatment groups are significant at the 10% level in a logistic regression (see table 6 in the appendix). Given that our hypotheses are directed, this implies significance at the 5% level in a one-sided test. The difference between the two treatments is significant at the 1% level in a two-sided test. However, the differences between the basic treatment and the experimental treatments turn insignificant once control variables are included.

The allocation error in general differs significantly from zero, confirming that people do not allocate their money optimally. However, looking at the proportion of subjects with optimal behavior alone is too coarse, as it ignores any changes within the group of participants that repay non-optimally. Therefore, we look at the more fine-grained allocation error in the next section.
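The reported proportions can be cross-checked with a simple pooled two-proportion z-test. This is not the logistic regression used in the paper but a swapped-in, simpler test for illustration. The counts of optimally behaving subjects per group (24, 15 and 37) are not stated in the text; they are back-computed here from the reported percentages and group sizes and should be read as an assumption. The function name is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided pooled z-test for equality of two proportions.

    Returns the z statistic and the two-sided p-value.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# (optimal subjects, group size); counts back-computed from the percentages.
OPTIMAL = {
    "basic": (24, 131),
    "ShowNewDebts": (15, 135),
    "ShowSavedMoney": (37, 138),
}

if __name__ == "__main__":
    pairs = [("basic", "ShowNewDebts"),
             ("basic", "ShowSavedMoney"),
             ("ShowNewDebts", "ShowSavedMoney")]
    for a, b in pairs:
        z, p = two_proportion_z_test(*OPTIMAL[a], *OPTIMAL[b])
        print(f"{a} vs {b}: z = {z:.3f}, two-sided p = {p:.4f}")
```

With these back-computed counts, the pattern of this simpler test is in line with the reported significance levels: both comparisons with the basic group fall between the 5% and 10% level two-sided, and the comparison between the two treatments falls below 1%.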

Figure 1: Proportion of optimal-behaving subjects split by treatment (bar chart; x-axis: basic, ShowNewDebts, ShowSavedMoney; y-axis: percentage)


5.3 Differences in allocation error between treatments

For every treatment we calculate the overall allocation error AE as the mean of the allocation error $e_{ij}$ over all participants and rounds. Let $n_T$ be the number of participants in treatment $T$. Then

$$AE_T = \frac{1}{n_T} \sum_{i=1}^{n_T} \left( \frac{1}{10} \sum_{j=1}^{10} e_{ij} \right).$$

We hypothesize that this allocation error is affected by the treatment and the control variables. Hence, we obtain the following linear regression model:

$$\begin{aligned}
AE_i = {} & \beta_0 + \beta_1 \cdot \text{ShowNewDebts} + \beta_2 \cdot \text{ShowSavedMoney} + \beta_3 \cdot \text{fin. literacy} \\
& + \beta_4 \cdot \text{fin.lit.} \cdot \text{ShowNewDebts} + \beta_5 \cdot \text{fin.lit.} \cdot \text{ShowSavedMoney} \\
& + \beta_6 \cdot \text{yoe} + \beta_7 \cdot \text{male} + \beta_8 \cdot \text{thirdgender} + \beta_9 \cdot \text{age} \\
& + \beta_{10} \cdot \text{\#creditcards} + \beta_{11} \cdot \text{\#access. creditcards} \\
& + \beta_{12} \cdot \text{AtWork} + \beta_{13} \cdot \text{NotUse} + \beta_{14} \cdot \text{AtWork\_NotUse} + \epsilon_i
\end{aligned}$$

Note that $\beta_0$ is the constant, that is, the average allocation error of a woman in the basic treatment with a value of zero in all numerical variables. The coefficients $\beta_1$ and $\beta_2$ measure the difference in the other treatments, with ShowNewDebts and ShowSavedMoney as dummy variables. With $\beta_4$ and $\beta_5$ we include the interaction between financial literacy and treatment, meaning that our model allows the effect of the treatment on the allocation error to differ with financial literacy. For this reason we have to center the financial literacy variable, because the difference between the treatments in the allocation error now depends on the value of financial literacy. Given that this variable ranges from 0 to 6, we use 3 as the reference value. $\beta_6$ is the coefficient on the number of years of education, $\beta_7$ stands for the change in the allocation error if the subject is male instead of female, and $\beta_8$ does the same for the third gender. $\beta_{10}$ and $\beta_{11}$ belong to the number of owned credit cards and the number of additionally accessible credit cards (e.g. via spouse, friends). $\beta_{12}$, $\beta_{13}$ and $\beta_{14}$ capture changes in the allocation error if a participant uses credit cards at work, usually does not use credit cards, or both together. As with gender and treatments, the corresponding variables are dummy variables.

As table 2 shows, the average allocation error is 27.4% in the basic treatment, 31.16% in the ShowNewDebts-treatment and 18.68% in the ShowSavedMoney-treatment. In the model with all control variables (table 3), the ShowSavedMoney-treatment significantly reduces the allocation error by 8 percentage points compared to the basic group at a significance level of 5%, supporting our hypothesis that this presentation helps subjects to overcome the cuckoo fallacy. The ShowNewDebts-treatment increases the allocation error by 3.1 percentage points compared to the basic group, which is not significant, but at least in line with the hypothesis that this treatment should increase the occurrence of the cuckoo fallacy. One possible reason for this non-significance may be that a substantial number of subjects in the basic treatment already focusses on new debts as well. This explanation is supported by the question in which we asked participants to describe their strategies. Many of the participants in the basic group report a strategy showing that they in fact fall for the cuckoo fallacy.

High financial literacy significantly decreases the allocation error, while variables like gender, age and experience with credit cards do not seem to have an effect on the allocation error. That is, people with higher financial literacy distribute their money better in terms of minimizing debt. However, we do not find an interaction between financial literacy and treatment. Following Akaike’s information criterion, the best regression model includes only financial literacy and the years of education as control variables:

$$AE_i = \beta_0 + \beta_1 \cdot \text{ShowNewDebts} + \beta_2 \cdot \text{ShowSavedMoney} + \beta_3 \cdot \text{fin. literacy} + \beta_6 \cdot \text{yoe} + \epsilon_i$$

Additionally, we consider a model without any control variables that simply shows the dependence of the allocation error on the treatment. All three models (simple, Akaike-optimal, complete) show a significant difference in allocation error between the basic and the ShowSavedMoney-treatment, which implies that this way of presenting the information leads to an improvement of participant behavior. Table 3 shows all three models. Note that the basic treatment is the default treatment, i.e. the constant of the regression refers to this treatment and the dummy variables of the other treatments are to be understood as differences to the basic treatment. In the same way, the estimates for the dummy variables male and thirdgender are to be understood as differences to females.

It is also notable that the R-squared values of all models are low. But given that the dependent variable can only vary between 0 and 1 and that the subjects, especially on mTurk, behave heterogeneously and use a large part of this span, the R-squared values are in the expected range. Additional tests with fewer assumptions can be found in the robustness checks.
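The coding of the regressors described above (treatment and gender dummies with the basic treatment and females as reference, financial literacy centered at 3, and the literacy-treatment interactions) can be sketched as follows. The function `design_row` and its argument names are hypothetical, and the coding of AtWork_NotUse as "both answers given together" is an assumption based on the verbal description, not a confirmed detail of the authors' implementation.

```python
TREATMENTS = ("basic", "ShowNewDebts", "ShowSavedMoney")
GENDERS = ("female", "male", "third")


def design_row(treatment, fin_literacy, yoe, gender, age,
               n_cards, n_access_cards, at_work, not_use):
    """Build one row of regressors for the full model as a dict."""
    if treatment not in TREATMENTS:
        raise ValueError(f"unknown treatment: {treatment}")
    if gender not in GENDERS:
        raise ValueError(f"unknown gender: {gender}")

    show_new = int(treatment == "ShowNewDebts")
    show_saved = int(treatment == "ShowSavedMoney")
    lit_centered = fin_literacy - 3  # index ranges 0..6, reference value 3

    return {
        "const": 1,
        "ShowNewDebts": show_new,
        "ShowSavedMoney": show_saved,
        "fin_lit": lit_centered,
        "fin_lit_x_ShowNewDebts": lit_centered * show_new,
        "fin_lit_x_ShowSavedMoney": lit_centered * show_saved,
        "yoe": yoe,
        "male": int(gender == "male"),
        "thirdgender": int(gender == "third"),
        "age": age,
        "n_creditcards": n_cards,
        "n_access_creditcards": n_access_cards,
        "AtWork": int(at_work),
        "NotUse": int(not_use),
        # Assumption: flags participants who gave both answers.
        "AtWork_NotUse": int(at_work and not_use),
    }


if __name__ == "__main__":
    print(design_row("ShowSavedMoney", 5, 16, "male", 30, 2, 1, False, False))
```

With this coding, a woman in the basic treatment with a financial literacy score of 3 has zeros in all treatment, gender and literacy regressors, so the treatment coefficients read as differences to the basic treatment at the reference literacy level.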


Table 2: Allocation error in the treatments

| Allocation error | N | Mean | Median | St. Dev. | Min | Pctl(25) | Pctl(75) | Max |
|---|---|---|---|---|---|---|---|---|
| All data | 404 | 0.257 | 0.295 | 0.202 | 0.000 | 0.080 | 0.378 | 1.000 |
| basic treatment | 131 | 0.274 | 0.300 | 0.203 | 0.000 | 0.100 | 0.400 | 1.000 |
| ShowNewDebts | 135 | 0.312 | 0.314 | 0.202 | 0.000 | 0.230 | 0.388 | 1.000 |
| ShowSavedMoney | 138 | 0.187 | 0.147 | 0.180 | 0.000 | 0.000 | 0.335 | 1.000 |

Figure 2: Allocation error split by treatment (boxplots of the allocation error AE for basic, ShowNewDebts and ShowSavedMoney)


Table 3: Linear regression of the allocation error

Dependent variable: Allocation error. Standard errors in parentheses.

| | Whole model (1) | Akaike-optimal (2) | Minimal model (3) |
|---|---|---|---|
| ShowNewDebts | 0.031 (0.026) | 0.030 (0.023) | 0.038 (0.024) |
| ShowSavedMoney | −0.080∗∗∗ (0.027) | −0.091∗∗∗ (0.023) | −0.087∗∗∗ (0.024) |
| Financial literacy | −0.044∗∗∗ (0.013) | −0.046∗∗∗ (0.007) | |
| Years of education (yoe) | −0.007∗ (0.004) | −0.007∗ (0.004) | |
| Male | 0.018 (0.020) | | |
| Third gender | 0.058 (0.180) | | |
| Age | 0.0002 (0.001) | | |
| #creditcards | −0.003 (0.004) | | |
| #access. creditcards | 0.005 (0.008) | | |
| AtWork | −0.001 (0.027) | | |
| NotUse | −0.011 (0.028) | | |
| AtWork_NotUse | −0.110∗ (0.059) | | |
| ShowNewDebts · Fin.lit. | 0.005 (0.017) | | |
| ShowSavedMoney · Fin.lit. | −0.008 (0.017) | | |
| Constant | 0.411∗∗∗ (0.077) | 0.560∗∗∗ (0.066) | 0.274∗∗∗ (0.017) |
| Observations | 379 | 404 | 404 |
| R2 | 0.189 | 0.183 | 0.068 |
| Adjusted R2 | 0.157 | 0.175 | 0.064 |
| Residual Std. Error | 0.178 (df = 364) | 0.183 (df = 399) | 0.195 (df = 401) |
| F Statistic | 6.040∗∗∗ (df = 14; 364) | 22.309∗∗∗ (df = 4; 399) | 14.706∗∗∗ (df = 2; 401) |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Financial literacy was centered at a value of 3.


Figure 3: Allocation error split by treatment with pointwise 95%-confidence intervals (bar plot of the allocation error AE for basic, ShowNewDebts and ShowSavedMoney)

We can take a closer look at the repayment behavior by investigating the development of the mean payments on the high-interest credit card over the rounds. The overall share of payments on the high-interest credit card varies roughly between 60% and 90%, depending on the treatment. In Fig. 4, we can see that the curves of the basic treatment and the ShowNewDebts-treatment are close together. Both seem to have a break after round five, where the payments on the high-interest credit card decrease. We explain this effect by the occurrence of the cuckoo fallacy: although the new debts per card are not explicitly mentioned in the basic treatment, people seem to base their decision on this value anyway. Since round six is the first round in which the low-interest card can produce more new debts than the high-interest card, it is the first round in which a participant who focuses on repaying the card that produces more new debts can succumb to the fallacy. Only in the ShowSavedMoney-treatment is the curve nearly stable over time, probably because participants do not get the information on individual credit card balances directly. Furthermore, in the ShowNewDebts-treatment, we can see a zigzag pattern beginning in round 5. We explain this with an amplified cuckoo effect: since the ShowNewDebts-treatment steers attention onto the new debts, subjects are influenced by them, which generates the pattern in the time series of this treatment.


Figure 4: Development of the average repayment on the high-interest credit card (line plot; x-axis: round number 1 to 10; y-axis: percentage; one line each for Basic, ShowNewDebts and ShowSavedMoney)

5.4 Investigating the cuckoo fallacy

In order to take a closer look at the break after round five in the two treatments showing the individual balances, we divide the data into rounds in which the high-interest credit card produces more newly added debts (high-interest class) and rounds in which the low-interest credit card does (low-interest class). To do so, we need to exclude a large group of 143 subjects who are never in the situation where the low-interest credit card produces higher new debts, because they repay too much money to the low-interest credit card. Our subsample thus consists of 82 participants in the control group, 78 in the ShowNewDebts treatment and 101 in the ShowSavedMoney treatment.

This unbalanced exclusion of participants could in theory lead to confounding effects. However, keep in mind that to be excluded, a participant had to repay relatively small amounts to the more expensive card. This implies that we excluded a large fraction of irrational people, as the number of rounds in which the cheaper card produces more new debts increases with rationality. So our remaining subsample is probably more rational than the complete sample. We have the smallest number of participants (implying the highest degree of average rationality) in the treatment where we expect the most irrational behavior, and the least rational sample in the treatment where we expect the most rational behavior. So keeping possible confounding effects in our analysis actually yields a more conservative estimate of the effects of our manipulations.
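The split into round classes, and the reason why the low-interest class is where the fallacy can bite, can be illustrated with a short sketch. It assumes that new debt per round equals balance times rate; the balances and rates in the usage example are hypothetical, as is the tie-breaking rule that assigns equal new debts to the high-interest class.

```python
def round_class(balance_high, rate_high, balance_low, rate_low):
    """Classify a round by the card that produces more new debt."""
    new_high = balance_high * rate_high
    new_low = balance_low * rate_low
    # Assumption: ties are assigned to the high-interest class.
    return "high-interest class" if new_high >= new_low else "low-interest class"


def cheaper_to_repay(rate_high, rate_low):
    """Card on which one repaid dollar saves more interest next round."""
    return "high" if rate_high >= rate_low else "low"


if __name__ == "__main__":
    # Hypothetical round: the low-interest card carries a much larger balance.
    print(round_class(1000.0, 0.05, 2000.0, 0.03))
    print(cheaper_to_repay(0.05, 0.03))
```

In the hypothetical round above, the low-interest card generates more new debt in absolute terms, yet every dollar repaid on the high-interest card still saves more interest. A participant who targets the card with the larger new debt therefore deviates from the optimal allocation exactly in such rounds.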

We believe that subjects who are vulnerable to the cuckoo fallacy focus on the card that produces more new debts, so someone falling for the cuckoo fallacy should act differently in the high-interest class than in the low-interest class. This implies that the allocation error is higher in the low-interest class. We run a regression with an interaction of interest class and treatment. The dependent variable is the allocation error. Independent variables are a dummy variable indicating whether the observation is in the high-interest class (high_int_class), whose coefficient $\alpha$ captures the additional effect of the round class, as well as the usual control variables:

$$\begin{aligned}
AE_i = {} & \beta_0 + \alpha \cdot \text{high\_int\_class} + \beta_1 \cdot \text{ShowNewDebts} + \beta_2 \cdot \text{ShowSavedMoney} + \beta_3 \cdot \text{fin. literacy} \\
& + \beta_4 \cdot \text{fin.lit.} \cdot \text{ShowNewDebts} + \beta_5 \cdot \text{fin.lit.} \cdot \text{ShowSavedMoney} \\
& + \beta_6 \cdot \text{yoe} + \beta_7 \cdot \text{male} + \beta_8 \cdot \text{thirdgender} + \beta_9 \cdot \text{age} \\
& + \beta_{10} \cdot \text{\#creditcards} + \beta_{11} \cdot \text{\#access. creditcards} \\
& + \beta_{12} \cdot \text{AtWork} + \beta_{13} \cdot \text{NotUse} + \beta_{14} \cdot \text{AtWork\_NotUse} + \epsilon_i
\end{aligned}$$

The interaction terms between high_int_class and the two treatment dummies, which are reported in table 4, enter the estimated model in addition. Again we look at the interaction between treatment and financial literacy, so we center financial literacy at the reference value 3 as before. The basic treatment is the default treatment, so all treatment coefficients are differences to the basic treatment.

Looking at table 4, the regressions with and without control variables show that in all treatments the payments on the high-interest credit card are significantly lower if the low-interest card produces more newly added debts. This supports our hypothesis that the cuckoo fallacy causes important parts of the allocation error. The strength of the cuckoo fallacy is moderated by the treatment, since the differences between the treatments in the influence of the interest class are significant.⁵ As expected, the interest class has the weakest effect in the ShowSavedMoney-treatment, because participants are less likely to fall for the cuckoo fallacy when they do not focus on new debts. In the basic treatment the difference between the interest classes is stronger, and it is stronger still in the ShowNewDebts-treatment. This supports the idea that subjects are focused on new debts even in the basic treatment, but that this focus is strengthened in the ShowNewDebts-treatment. This focus seems to have a strong influence on the likelihood that the cuckoo fallacy occurs. Considering only the high-interest class, there are no differences in the allocation error between the treatments, because in this class the participants were not in the situation where the low-interest credit card produces higher debts. Hence, the cuckoo fallacy cannot occur. Considering the control variables, higher financial literacy and a higher number of years of education decrease the allocation error significantly, but financial literacy does not have different impacts in different treatments. Figure 5 summarizes all differences between the interest classes.

⁵ Note that the coefficient “high int. class · ShowNewDebts” in the interaction between interest class and treatment is only marginally significant (p<0.1), because the difference in the cuckoo fallacy between the basic and the ShowNewDebts-treatment is not that strong. The reported test is two-sided, although our theory suggests one direction, namely that the ShowNewDebts-treatment strengthens the cuckoo fallacy; halving the p-value for a one-sided test would therefore yield a significant result as well. To keep all p-values in the regressions comparable, we refrain from running one-sided tests for individual coefficients.

Table 4: Allocation error split by round class

Dependent variable: Allocation error. Standard errors in parentheses.

| | Whole model (1) | Akaike-optimal (2) | Minimal model (3) |
|---|---|---|---|
| high_int_class | −0.219∗∗∗ (0.038) | −0.224∗∗∗ (0.037) | −0.224∗∗∗ (0.038) |
| ShowNewDebts | 0.091∗∗ (0.046) | 0.102∗∗∗ (0.038) | 0.112∗∗∗ (0.038) |
| ShowSavedMoney | −0.176∗∗∗ (0.044) | −0.188∗∗∗ (0.036) | −0.181∗∗∗ (0.036) |
| high int. class · ShowNewDebts | −0.098∗ (0.055) | −0.098∗ (0.054) | −0.098∗ (0.054) |
| high int. class · ShowSavedMoney | 0.139∗∗∗ (0.052) | 0.150∗∗∗ (0.050) | 0.150∗∗∗ (0.051) |
| Financial literacy | −0.024 (0.017) | −0.015∗ (0.008) | |
| Years of education (yoe) | −0.011∗∗ (0.005) | −0.011∗∗ (0.005) | |
| Male | −0.004 (0.023) | | |
| Age | −0.001 (0.001) | | |
| #creditcards | −0.0002 (0.004) | | |
| #access. credit cards | −0.007 (0.009) | | |
| AtWork | 0.006 (0.032) | | |
| NotUse | 0.008 (0.033) | | |
| AtWork_NotUse | −0.040 (0.065) | | |
| ShowNewDebts · Fin.lit. | 0.013 (0.022) | | |
| ShowSavedMoney · Fin.lit. | 0.002 (0.022) | | |
| Constant | 0.559∗∗∗ (0.098) | 0.523∗∗∗ (0.080) | 0.328∗∗∗ (0.027) |
| Observations | 498 | 522 | 522 |
| R2 | 0.244 | 0.245 | 0.230 |
| Adjusted R2 | 0.219 | 0.234 | 0.223 |
| Residual Std. Error | 0.242 (df = 481) | 0.239 (df = 514) | 0.241 (df = 516) |
| F Statistic | 9.688∗∗∗ (df = 16; 481) | 23.788∗∗∗ (df = 7; 514) | 30.839∗∗∗ (df = 5; 516) |

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01

Figure 5: Payments on the high interest credit card split by the class of rounds (x-axis: card that causes the higher new debts, low interest card or high interest card; y-axis: allocation error; one line each for ShowNewDebts, basic and ShowSavedMoney)

5.5 Qualitative Analysis

As we have noted, many subjects do not pay back their debts optimally. To investigate why, we ask participants about their repayment strategies. We qualitatively analyze the answers by categorizing them. We have established the following categories:

  • Perfect: Contains participants who only pay back the highest-interest credit card and therefore repay optimally.

  • NewDebts: Contains subjects who base their decision on the new debts. According to their self-report, these are the ones committing the cuckoo fallacy.

  • Share: Contains subjects who report a strategy based on a split of the allocable money between the two credit cards.

  • Other: A small share of our subjects have other strategies like 1/n (equal split of the available money), repayment on the lower-interest card in order to repay it completely, or the attempt to equalize the account balances.

  • ?: The residual category for subjects who do not seem to follow any strategy at all.

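With these categories we obtain the following classification:

Table 5: Classification of the subjects in categories

| | Perfect | NewDebts | Share | Other | ? |
|---|---|---|---|---|---|
| Number of subjects | 136 | 107 | 103 | 23 | 35 |
| Percentage of subjects | 33.66% | 26.49% | 25.5% | 5.69% | 8.66% |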

Figure 6: Graph of the classification of the subjects (bar chart of the self-reported behavior; x-axis: Perfect, NewDebts, Share, Other, ?; y-axis: percentage of subjects)


First, note that every one of the 404 participants wrote a meaningful answer to the question about their strategy. This suggests that the majority of participants took the study seriously and thought about the optimal strategy.

The qualitative analysis of the self-reported behavior confirms the results of the allocation error measurement. A large number of subjects report the optimal repayment strategy. At almost 34%, this proportion is higher than in the quantitative analysis. This can be explained by the fact that optimal behavior in the quantitative analysis requires an allocation error of exactly zero, while the participants in the qualitative analysis report an overall strategy from which some of them deviate in individual rounds as a test.

The focus on new debts is the second most common strategy. The self-reported strategies confirm the assumption that the higher allocation error in rounds of the low-interest class really is caused by the cuckoo fallacy, as some people describe this exact behavior. We have many descriptions of participants who started repaying the high-interest credit card and switched at some point to repaying the other card, because its balance gets too high or it produces more new debt. In any case, many subjects are distracted by the account balances (instead of paying attention to the interest rates) and switch the repayment card after some rounds.

Another thing subjects often report is that they divide the money between the two cards. Some of them want to repay at least the new debts, others base their split on account balances or interest rates. This study is not suited to determine the reasons for splitting. However, it can be seen that only simple divisions (e.g., 1:2, 1:3) are used, and therefore not every situation creates an equal incentive to divide the available money. In a follow-up study we will examine the situations and reasons that lead to splits.

5.6 Robustness checks

We test the robustness of our results in several ways. First, we repeat all analyses with the whole sample of 414 participants, i.e. we do not exclude the screened-out subjects. There is no difference in the results.

With the linear regression, we implicitly assume normally distributed data. We therefore repeat the analyses of the main study with non-parametric tests and without control variables. With a 3-sample binomial test for equality of proportions, we find significant differences in the number of perfectly repaying participants between the treatments. Applying the binomial test pairwise to the three treatments, we see that this number is significantly higher in the ShowSavedMoney-treatment than in the other two, but that there is no significant difference between the basic and the ShowNewDebts-treatment. Furthermore, we test the difference in the allocation error between the treatments with a Kruskal-Wallis test. Since p<0.01, there are significant differences between the treatments. To be more specific, we compare the treatments in pairs with Wilcoxon’s rank sum tests and obtain significant differences in the allocation error between the basic and the ShowSavedMoney-treatment and between the ShowNewDebts- and the ShowSavedMoney-treatment, just like in the main analyses and just like when comparing the number of perfectly repaying subjects.

Next, consider the difference in repayment behavior between rounds in which the higher-interest credit card causes more new debts (high-interest class) and rounds in which the lower-interest credit card does so (low-interest class). Again we calculate the mean allocation error for each participant over all rounds in each class. To respect the dependence of the two data points of each participant, we run Wilcoxon’s signed rank test. There are significant differences between the high-interest class and the low-interest class in all three treatments. Furthermore, the difference between the high and the low class in the ShowSavedMoney-treatment is significantly smaller than the differences in the other two treatments. All these results are the same as in the main analyses, so we have no problem with our assumptions.

The root mean square error is not the only possible measure of the allocation error. So we check the robustness of our results for another measure, the mean absolute error. With the above nomenclature, we get the following measure:

$$\tilde{e}_{ij} = \frac{\lVert p_{ij} - o_{ij} \rVert_1}{2 M_{ij}} = \frac{1}{2 M_{ij}} \sum_{k=1}^{3} \left| p_{ijk} - o_{ijk} \right|$$

Again we do not find any differences in the results for our hypotheses.

In online surveys, there is always the risk of getting participants who do not really care about the study and just click as fast as they can, because they are not observed by an experimenter. Most of these subjects should also fail the attention tests, so that they have already been screened out. But there may be cases in which a participant is only attentive when she sees a new screen, and simply clicks the same buttons on repeated screens like the ten repayment rounds. To rule out this possibility too, we additionally screen out participants who do nothing for several rounds, meaning that they have at least $2000 in their checking account at the end. It turns out that there are only four participants with this behavior, and screening them out changes nothing in our results.


6 Conclusion

In this paper, we conduct an experiment with 404 participants from the US on the platform Amazon mTurk to study their credit card repayment behavior. We find that only a small fraction of subjects repays optimally. Instead, a large fraction of participants focusses on the card that produces more new debts, leading to a non-optimal split of repayments. Many of them first repay the credit card that produces a higher amount of debt in the next interest round; we call this error the cuckoo fallacy.

Our main contribution is to show that the theorized cuckoo fallacy exists in an experimental setting and that its appearance can be manipulated by altering the presentation of information. Increasing the salience of the new debt a credit card produces does not increase the error, probably because a large number of subjects already seems to be worried about new debts in the basic game. But highlighting the information that shows how much money one can save in the next round with a given repayment allocation, and steering attention onto it, does increase optimal repayment behavior.

The findings of this study have consequences for the design of credit card statements, for financial consultancy and for financial education. Highlighting on a credit card statement the amount of interest a debtor saves by repaying debts could improve repayment behavior. Educating debtors about the existence of the fallacy and teaching them how to repay properly would be another big step in that direction. However, it is questionable whether financial institutions have the incentives to implement these changes on their own.

A limiting factor of this study is the restriction to participants from the US, so in future work it would be interesting to find out whether these results can also be obtained in countries where the use of credit cards is not part of everyday life. Furthermore, we limit our experiment to treatments that are as homogeneous and comparable as possible in terms of the optimal behavior, meaning that the set of possible behaviors does not differ across treatments. That rules out changes in the number of credit cards and more realistic basic conditions like minimum payments, interest changes or overdrawing of an account. Perhaps more heuristics leading to wrong behavior could be found if we allowed deeper changes in the treatments in the future. Future work should also look at data sources other than experimental data, preferably from the field, to test the hypothesis that the cuckoo fallacy is a mere experimental artifact.
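To make the difference between the cuckoo fallacy and optimal repayment concrete, the following Python sketch compares the two strategies for a single round. All balances, rates and the budget are hypothetical numbers chosen for illustration and are not the parameters of the experiment. The sketch also assumes that interest is charged once per round on the remaining balance and that the budget is smaller than either balance, so that putting the whole budget on the card with the highest interest rate minimizes next-round interest.

```python
def interest_next_round(balances, rates, repayments):
    """Interest charged next round on what remains after the repayments."""
    return sum((b - r) * rate for b, rate, r in zip(balances, rates, repayments))


def repay_single_card(budget, n_cards, target):
    """Put the whole budget on the card with index `target`."""
    return [budget if i == target else 0.0 for i in range(n_cards)]


# Hypothetical setting: the large card has the LOWER rate but produces MORE new debt.
balances = [3000.0, 1000.0]   # card 0, card 1
rates = [0.10, 0.15]          # interest per round
budget = 500.0

new_debt = [b * r for b, r in zip(balances, rates)]  # new debt per card next round

# Cuckoo fallacy: repay the card that produces the most new debt.
cuckoo_target = max(range(len(balances)), key=lambda i: new_debt[i])
# Benchmark under the stated assumptions: repay the card with the highest rate.
optimal_target = max(range(len(balances)), key=lambda i: rates[i])

cuckoo_cost = interest_next_round(
    balances, rates, repay_single_card(budget, len(balances), cuckoo_target)
)
optimal_cost = interest_next_round(
    balances, rates, repay_single_card(budget, len(balances), optimal_target)
)

print(f"cuckoo strategy repays card {cuckoo_target}, interest {cuckoo_cost:.2f}")
print(f"rate-based strategy repays card {optimal_target}, interest {optimal_cost:.2f}")
```

In this made-up example the card with the larger balance produces more new debt despite its lower rate, so the cuckoo strategy targets it and ends up paying more interest than the rate-based strategy.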


References

Agarwal, S., Chomsisengphet, S., Liu, C. & Souleles, N. S. (2015), ‘Do consumers choose the right credit contracts?’, Review of Corporate Finance Studies 4(2), 239–257.
Amir, O., Rand, D. G. & Gal, Y. K. (2012), ‘Economic games on the internet: The effect of $1 stakes’, PLoS ONE 7(2).
Benartzi, S. & Thaler, R. H. (2001), ‘Naive diversification strategies in defined contribution saving plans’, The American Economic Review 91(1), 79–98.
Berg, J. (2016), Income security in the on-demand economy: Findings and policy lessons from a survey of crowdworkers. Conditions of Work and Employment Series No. 74.
Berinsky, A. J., Huber, G. A. & Lenz, G. S. (2012), ‘Evaluating online labor markets for experimental research: Amazon.com’s mechanical turk’, Political Analysis 20(3), 351–368.
Beshears, J., Choi, J. J., Laibson, D. & Madrian, B. C. (2018), Behavioral household finance. Working Paper.
Camerer, C. F., Dreber, A., Holzmeister, F., Ho, T.-H., Huber, J., Johannesson, M., Kirchler, M., Nave, G., Nosek, B. A., Pfeiffer, T., Altmejd, A., Buttrick, N., Chan, T., Chen, Y., Forsell, E., Gampa, A., Heikensten, E., Hummer, L., Imai, T., Isaksson, S., Manfredi, D., Rose, J., Wagenmakers, E.-J. & Wu, H. (2018), ‘Evaluating the replicability of social science experiments in nature and science between 2010 and 2015’, Nature Human Behaviour 2(9), 637–644.
Chandler, J., Mueller, P. & Paolacci, G. (2014), ‘Nonnaïveté among amazon mechanical turk workers: Consequences and solutions for behavioral researchers’, Behavior Research Methods 46(1), 112–130.
Chandler, J., Paolacci, G., Peer, E., Mueller, P. & Ratliff, K. A. (2015), ‘Using nonnaive participants can reduce effect sizes’, Psychological Science 26(7), 1131–1139.
Chandler, J. & Shapiro, D. (2016), ‘Conducting clinical research using crowdsourced convenience samples’, Annual Review of Clinical Psychology 12, 53–81.
Coppock, A. (2018), ‘Generalizing from survey experiments conducted on mechanical turk: A replication approach’, Political Science Research and Methods pp. 1–16.
Crump, M. J. C., McDonnell, J. V. & Gureckis, T. M. (2013), ‘Evaluating amazon’s mechanical turk as a tool for experimental behavioral research’, PLoS ONE 8(3).
DellaVigna, S. (2009), ‘Psychology and economics: Evidence from the field’, Journal of Economic Literature 47(2), 315–372.
European Social Survey (2016), ‘Ess round 8 source questionnaire’, London: ESS ERIC Headquarters c/o City University London.
Ford, J. B. (2017), ‘Amazon’s mechanical turk: A comment’, Journal of Advertising 46(1), 156–158.
Gathergood, J., Mahoney, N., Stewart, N. & Weber, J. (2018), How do individuals repay their debt? the balance-matching heuristic. Working Paper.
Goodman, J. E. & Paolacci, G. (2017), ‘Crowdsourcing consumer research’, Journal of Consumer Research 44(1), 196–210.


Goodman, J. K., Cryder, C. E. & Cheema, A. (2013), ‘Data collection in a flat world: The strengths and weaknesses of mechanical turk samples’, Journal of Behavioral Decision Making 26, 213–224.
Gorbachev, O. & Luengo-Prado, M. J. (forthcoming), ‘The credit card debt puzzle: The role of preferences, credit access risk, and financial literacy’, Review of Economics and Statistics.
Gross, D. B. & Souleles, N. (2002), ‘Do liquidity constraints and interest rates matter for consumer behavior? evidence from credit card data’, Quarterly Journal of Economics 117(1), 149–185.
Hara, K., Adams, A., Milland, K., Savage, S., Callison-Burch, C. & Bigham, J. P. (2017), A data-driven analysis of workers’ earnings on amazon mechanical turk. Working Paper.
Harrell Jr, F. E., with contributions from Charles Dupont & many others. (2018), Hmisc: Harrell Miscellaneous. R package version 4.1-1. URL: https://CRAN.R-project.org/package=Hmisc
Hauser, D. J., Paolacci, G. & Chandler, J. (2018), Common concerns with mturk as a participant pool: Evidence and solutions. Working Paper.
Hauser, D. J. & Schwarz, N. (2016), ‘Attentive turkers: Mturk participants perform better on online attention checks than do subject pool participants’, Behavior Research Methods 48(1), 400–407.
Hendriks, A. (2012), Sophie - software platform for human interaction experiments. Working Paper.
Herrnstein, R. J. (1961), ‘Relative and absolute strength of response as a function of frequency of reinforcement.’, Journal of the Experimental Analysis of Behavior 4(3), 267–272.
Hlavac, M. (2018), stargazer: Well-Formatted Regression and Summary Statistics Tables, Central European Labour Studies Institute (CELSI), Bratislava, Slovakia. R package version 5.2.2. URL: https://CRAN.R-project.org/package=stargazer
Horton, J. J., Rand, D. G. & Zeckhauser, R. J. (2011), ‘The online laboratory: conducting experiments in a real labor market’, Experimental Economics 14(3), 399–425.
Huff, C. & Tingley, D. (2015), ‘“who are these people?” evaluating the demographic characteristics and political preferences of mturk survey respondents’, Research & Politics 2(3), 1–12.
Kees, J., Berry, C., Burton, S. & Sheehan, K. (2017), ‘An analysis of data quality: Professional panels, student subject pools, and amazon’s mechanical turk’, Journal of Advertising 46(1), 141–155.
Keys, B. J., Pope, D. G. & Pope, J. C. (2016), Failure to refinance. Working Paper.
Keys, B. J. & Wang, J. (2016), Minimum payments and debt paydown in consumer credit cards. Working Paper.
Krupnikov, Y. & Levine, A. S. (2014), ‘Cross-sample comparisons and external validity’, Journal of Experimental Political Science 1(1), 59–80.
Laibson, D., Repetto, A. & Tobacman, J. (2001), A debt puzzle. Working Paper.
Lusardi, A. & Mitchell, O. S. (2011), ‘Financial literacy around the world: an overview’, Journal of Pension Economics and Finance 10(4), 497–508.
Lusardi, A. & Tufano, P. (2015), ‘Debt literacy, financial experiences, and overindebtedness’, Journal of Pension Economics & Finance 14(4), 332–368.


McCredie, M. N. & Morey, L. C. (2018), ‘Who are the turkers? a characterization of mturk workers using the personality assessment inventory’, Assessment.
Miller, J. D., Crowe, M., Weiss, B., Maples-Keller, J. L. & Lynam, D. R. (2017), ‘Using online, crowdsourcing platforms for data collection in personality disorder research: The example of amazon’s mechanical turk’, Personality Disorders: Theory, Research, and Treatment 8(1), 26–34.
Mullinix, K. J., Leeper, T. J., Druckman, J. N. & Freese, J. (2015), ‘The generalizability of survey experiments’, Journal of Experimental Political Science 2(2), 109–138.
Open Science Collaboration (2015), ‘Estimating the reproducibility of psychological science’, Science 349(6251), 943–951.
Paolacci, G., Chandler, J. & Ipeirotis, P. G. (2010), ‘Running experiments on amazon mechanical turk’, Judgment and Decision Making 5(5), 411–419.
Peer, E., Vosgerau, J. & Acquisti, A. (2014), ‘Reputation as a sufficient condition for data quality on amazon mechanical turk’, Behavior Research Methods 46(4), 1023–1031.
Ponce, A., Seira, E. & Zamarripa, G. (2017), ‘Borrowing on the wrong credit card? evidence from mexico’, The American Economic Review 107(4), 1335–1361.
R Core Team (2018), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria. URL: https://www.R-project.org/
Ramsey, S. R., Thompson, K. L., McKenzie, M. & Rosenbaum, A. (2016), ‘Psychological research in the internet age: The quality of web-based data’, Computers in Human Behavior 58, 354–360.
Roulin, N. (2015), ‘Don’t throw the baby out with the bathwater: Comparing data quality of crowdsourcing, online panels, and student samples’, Industrial and Organizational Psychology 8(2), 190–196.
Snowberg, E. & Yariv, L. (2018), Testing the waters: Behavior across participant pools. Working Paper.
Stango, V. & Zinman, J. (2014), ‘Limited and varying consumer attention: Evidence from shocks to the salience of bank overdraft fees’, The Review of Financial Studies 27(4), 990–1030.
Stewart, N. (2009), ‘The cost of anchoring on credit-card minimum repayments’, Psychological Science 20(1), 39–41.
Thaler, R. (1985), ‘Mental accounting and consumer choice’, Marketing Science 4(3), 199–214.
Vulkan, N. (2000), ‘An economist’s perspective on probability matching’, Journal of Economic Surveys 14(1), 101–118.
Wolfson, S. N. & Bartkus, J. R. (2013), ‘An assessment of experiments run on amazon’s mechanical turk’, Mustang Journal of Business and Ethics 5, 119–129.
Zinman, J. (2015), ‘Household debt: Facts, puzzles, theories, and policies’, Annual Review of Economics 7, 251–276.


Appendix

Additional tables and graphics

Table 6: Logistic regression of the proportion of optimal-behaving subjects

Dependent variable: Behavior (0 = non-optimal, 1 = optimal)

Variable                    Whole model (1)      Akaike-optimal (2)   Minimal model (3)
ShowNewDebts                −0.740 (0.627)       −0.559 (0.370)       −0.585∗ (0.355)
ShowSavedMoney              0.486 (0.515)        0.596∗ (0.314)       0.491∗ (0.297)
Financial literacy          0.511∗∗ (0.243)      0.598∗∗∗ (0.127)
Years of education (yoe)    0.119∗∗ (0.061)      0.111∗ (0.059)
Male                        0.265 (0.290)
Third gender                −13.004 (882.743)
Age                         0.010 (0.013)
#creditcards                −0.002 (0.056)
#access. creditcards        −0.040 (0.119)
AtWork                      0.012 (0.405)
NotUse                      0.112 (0.419)
AtWork_NotUse               0.985 (0.833)
ShowNewDebts · Fin.lit.     0.121 (0.352)
ShowSavedMoney · Fin.lit.   0.045 (0.309)
Constant                    −4.445∗∗∗ (1.212)    −5.689∗∗∗ (1.093)    −1.495∗∗∗ (0.226)
Observations                379                  404                  404
Log Likelihood              −165.765             −172.711             −189.707
Akaike Inf. Crit.           361.530              355.421              385.414

Note: ∗p<0.1; ∗∗p<0.05; ∗∗∗p<0.01. Standard errors in parentheses. Financial literacy was centered at a value of 3.
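The coefficients in Table 6 are on the log-odds scale, which makes their size hard to read directly. As a reading aid, the following Python sketch converts the coefficients of the minimal model (3) into implied probabilities of optimal behavior per treatment and recomputes the Akaike information criterion from the reported log-likelihoods. It is an illustration based only on the rounded table values, not the original estimation code. The function names are hypothetical, the log-likelihoods are read as negative values, and the parameter counts (15, 5 and 3, including the constant) are inferred from the rows of the table.

```python
import math


def logistic(x: float) -> float:
    """Inverse logit: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


# Rounded coefficients of the minimal model (3) in Table 6.
MINIMAL_MODEL = {"Constant": -1.495, "ShowNewDebts": -0.585, "ShowSavedMoney": 0.491}


def implied_probability(treatment: str) -> float:
    """Implied probability of optimal behavior; 'Basic' is the reference category."""
    effect = 0.0 if treatment == "Basic" else MINIMAL_MODEL[treatment]
    return logistic(MINIMAL_MODEL["Constant"] + effect)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 * log-likelihood."""
    return 2 * n_params - 2 * log_likelihood


for treatment in ("Basic", "ShowNewDebts", "ShowSavedMoney"):
    print(f"{treatment}: {implied_probability(treatment):.3f}")

# Parameter counts are inferred from the table rows (assumption).
for name, loglik, k in (
    ("Whole model", -165.765, 15),
    ("Akaike-optimal", -172.711, 5),
    ("Minimal model", -189.707, 3),
):
    print(f"{name}: AIC = {aic(loglik, k):.3f}")
```

With these inputs, the recomputed AIC values match the reported ones up to rounding, which supports reading the log-likelihoods as negative numbers.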


Table 7: Average proportionate repayment on the high-interest credit card per round

Round no.   Basic treatment   ShowNewDebts treatment   ShowSavedMoney treatment
1           0.846             0.805                    0.874
2           0.801             0.800                    0.845
3           0.808             0.758                    0.836
4           0.762             0.752                    0.815
5           0.725             0.723                    0.844
6           0.669             0.607                    0.801
7           0.642             0.654                    0.784
8           0.667             0.559                    0.768
9           0.667             0.616                    0.782
10          0.672             0.611                    0.783

Figure 7: Average proportion of payments in one round (panels: Basic, ShowNewDebts, ShowSavedMoney; categories in each panel: Giro, HighAPR, LowAPR; axis from 0.0 to 1.0)


Experiment screens - basic treatment

Figure 8: Welcome screen
Figure 9: Compensation scheme information
Figure 10: Experiment instructions
Figure 11: Comprehension task 1
Figure 12: Comprehension task 2
Figure 13: Comprehension task 3
Figure 14: Attention test 1
Figure 15: Info trial rounds
Figure 16: Repayment screen (trial, in round 1 of 2)
Figure 17: Info screen for the next trial round
Figure 18: Info screen after the last trial round
Figure 19: Payoff screen (trial, example)
Figure 20: Start of the main rounds
Figure 21: Repayment screen (main rounds, example of round 4)
Figure 22: Info screen of the main rounds
Figure 23: Info screen after the last main round
Figure 24: Payoff screen (main rounds, example)
Figure 25: Info screen for the questionnaire
Figure 26: Asking for the strategy
Figure 27: Asking for owned credit cards
Figure 28: Asking for accessible credit cards
Figure 29: Additional credit card questions
Figure 30: Financial literacy question 1
Figure 31: Financial literacy question 2
Figure 32: Financial literacy question 3
Figure 33: Financial literacy question 4
Figure 34: Attention test 2
Figure 35: Financial literacy question 5
Figure 36: Financial literacy question 6
Figure 37: Demographic questions
Figure 38: Comment

Experiment screens - differences in other treatments

Figure 39: ShowNewDebts-treatment - Info screen before repayments
Figure 40: ShowNewDebts-treatment - Repayment screen (main round 1)
Figure 41: ShowSavedMoney-treatment - Info screen before repayments
Figure 42: ShowSavedMoney-treatment - Repayment screen (main round 1)