Joint Pricing and Inventory Control under Reference Price Effects

Lisa Gimpl-Heersink 978-3-631-75380-4 Downloaded from PubFactory at 01/11/2019 05:41:29AM via free access

In this work, we address the problem of simultaneously determining a pricing and inventory replenishment strategy under reference price effects. The reference price effect models the fact that consumers react not only to the current price, but also to deviations from a reference price formed on the basis of past purchases. Immediate effects of price reductions on profits therefore have to be weighed against the resulting losses in future periods. Through analytical results and numerical simulations we study how the additional dynamics of the consumers' willingness to pay affect an optimal pricing and inventory control model and whether a simple policy such as a base-stock-list-price policy holds in such a setting.

Abstract
In many firms the pricing and inventory control functions are separated: the marketing department determines optimal prices first and then logistics decides on optimal stocking quantities, taking demand as exogenous and only considering incremental costs. However, a number of theoretical models suggest a joint determination of inventory levels and prices, as prices also affect stocking risks. In this work, we address the problem of simultaneously determining a pricing and inventory replenishment strategy under reference price effects. This reference price effect models the empirically well-established fact that consumers react sensitively not only to the current price, but also to deviations from a reference price formed on the basis of past purchases. The current price is then perceived as a discount or surcharge relative to this reference price. Thus, immediate effects of price reductions on profits have to be weighed against the resulting losses in future periods. We study how the additional dynamics of the consumers' willingness to pay affect an optimal pricing and inventory control model and whether a simple policy such as a base-stock-list-price policy holds in such a setting.
For a one-period planning horizon we analytically prove the optimality of a base-stock-list-price policy with respect to the reference price under general conditions. We then extend this result to the two-period time horizon for the linear and loss-neutral demand function and to the multi-period case under even more restrictive assumptions. However, numerical simulations suggest that a base-stock-list-price policy is also optimal for the multi-period setting under more general conditions. We furthermore show by numerical investigations that the presence of reference price effects decreases the incentive for price discounts to deal with overstocked situations. Moreover, we find that the potential benefits from simultaneously determining optimal prices and stocking quantities compared to a sequential procedure can increase considerably when reference price effects are included in the model. This makes an integration of pricing and inventory control with reference price effects well worth the effort.

1. Introduction

Problem description
Recent years have witnessed increased interest on the part of retail and manufacturing companies in investigating innovative pricing strategies in order to boost their operations and bottom line. In the past, grocery, drug or fashion apparel stores, for example, would fix a product's price over a relatively long time period and mainly focus on their inventory management in order to obtain a better match between supply and demand. This static pricing strategy was mainly due to the lack of information about their customers' tastes and willingness to pay and to the fact that high transaction costs (so-called menu costs) were associated with changing prices. Driven in large part by advances in information technology and e-commerce, a more sophisticated approach of changing a product's price found its way into retail and manufacturing industries. Here, the seller changes prices dynamically over time, based on factors like demand information, supply availability, production schedules and the time of sale. With the goal of balancing demand and supply, dynamic pricing methods were first applied by industries where the short-term capacity is hard to change, such as airlines, hotels and cruise ships (see Talluri and van Ryzin (2004) for more detail). Nowadays, the business model of dynamically changing the prices of a product is an important revolution in retail and manufacturing industries and is already widely practiced by e.g. Dell Computers and Amazon. There is a growing understanding that both pricing and replenishment decisions are essential for increasing a firm's profitability and thus should be coordinated. Nevertheless, they are traditionally mostly determined by separate functional areas of a company's organization: the marketing department sets prices, the market determines the quantity demanded, and the logistics unit produces the quantity demanded.
However, research work such as Whitin (1955) has already shown that the simultaneous determination of price and ordering or production quantity can yield substantial revenue increases. The coordination of price decisions and other aspects of the supply chain such as production and distribution is thus not only useful, but also essential. Coordinating these decisions means optimizing the system rather than its individual elements; it not only potentially increases profits but also reduces variability in demand or production, resulting in more efficient supply chains. Enabled by powerful IT systems that can today store and estimate thousands of demand models and compute integrated optimal policies, reengineering efforts are being initiated in many companies to eliminate the organizational barriers between distinct functional areas within the same enterprise by creating new entities with such designations as 'Revenue Management', 'Dynamic Pricing' or 'Smart Pricing'.

Research intention
Looking at the state-of-the-art methodological literature, we find that relevant work divides into two rather distinct streams: the operations-oriented stream (see chapter 2) and the marketing-oriented stream (see chapter 3). Eliashberg and Steinberg (1993) give a useful comparison of the two streams: operations management (or production management) deals with organizing and controlling the direct resources to produce the goods and services provided by an organization to customers. Marketing, in contrast, deals with the process of planning and executing pricing, promotion and distribution of goods and services in order to create exchanges that satisfy individual and organizational objectives. The interface between marketing and operations management is being recognized as a legitimate research domain and has received increased emphasis in the past. Nevertheless, as already stated above, in most firms the marketing and production functions are organizationally separate. A possible explanation could be that marketing is typically concerned with revenue maximization by setting prices and advertising policies. Here, relatively realistic demand models are used, which for example account for intertemporal demand correlations by incorporating both the current price and a reference price, which is formed on the basis of past purchases. However, they are based on a rather simplistic cost structure which does not account for supply chain management interactions, e.g. by assuming stationary variable costs. Operations management is typically concerned with cost minimization, meaning that production is required to produce the needed output at minimum cost. Thus rich cost models, which describe a firm's possible cost structure well, are used. Costs are assumed to be non-stationary, which means that they can vary over time, and fixed costs can be included in the model.
Furthermore, production decisions are integrated in the model (not only pricing but also inventory decisions), which is not the case in purely marketing-oriented work. The limitation of these models is that they rely on rather simplistic demand assumptions. Demand is, for example, modeled as a function of the current price only. In any case, both prevalent research streams consider only a partial picture of the relevant system. Typically, coordinated decision making results in better performance of the system. The magnitude of the improvement depends on how the objective functions are defined for the two separate departments and which department is assumed to act first.
Identifying this prevailing research gap leads us to address the problem of simultaneously determining a pricing and inventory replenishment strategy by combining the two literature streams described above: we want to take the rich and non-stationary cost models commonly used in operations research and combine them with demand models that account for intertemporal demand correlation and have so far been applied mainly in marketing. Both price and ordering quantity are to be dynamically adjusted according to the prevailing inventory, the consumers' willingness to pay and the remaining length of the finite selling horizon. The integration of reference price effects with inventory control models has not been reported so far in the literature. Hence, by developing such an integrated inventory control and pricing model, we will probe into the issue of whether using a reference price model to describe demand will significantly increase the benefits of integrating marketing and logistics decisions and when it makes sense to apply such models.
In this thesis we generally focus on linear demand models, which are detrended and seasonally adjusted. Furthermore, we only consider monopolistic pricing, ensuring mathematical tractability. These assumptions are not unrealistic, because price optimization by a firm is only possible in imperfect markets.
In the case of monopolistic competition a firm faces a range of prices where competitors do not react. The linear demand function is a local approximation conditional on competitors' prices, which remain unchanged if the price stays within this permissible range (see Phillips (2005), Chapter 1). These models are important not only in retail, where price-dependent demand plays a significant role, but also in manufacturing environments with a different underlying cost structure, in which production and distribution decisions can be complemented with pricing strategies in order to improve the firm's bottom line. Within this work we examine how the additional dynamics affect an optimal policy and whether variants of a simple policy such as a base-stock-list-price policy still hold in such a setting. Furthermore, we seek conditions under which the existence of a unique optimal solution can be shown analytically. The main focus of this dissertation is a mathematical analysis, which is why most problem definitions are taken from the literature. However, we will still try to motivate an economic understanding of dynamic market models and supply chain decisions wherever possible. Via numerical simulation we explore the size of the potential benefits of such models, as well as how optimal policies evolve over time and how optimal solutions vary with changes in the model parameters.

Structure of the thesis
We will here give a short outline of the structure of this thesis. Chapter 2 and chapter 3 are devoted to a brief review of the current state-of-the-art literature, relevant to this work, as well as some minor new results. The main new results will be presented in chapters 4 to 6.
Chapter 2 gives an overview of the models used in operations research so far. For didactical reasons we first introduce the theory of pure inventory control models in section 2.2, which are then expanded to the multi-period setting in section 2.3. For each of the two sections, we first focus on one-period models, which are then extended to the multi-period setting. We not only present the well-known critical fractile solution for the classical lost-sales version of the newsvendor problem, but also adapt the solution to the backlogging case including inventory holding and backlogging costs. Furthermore, the base-stock-list-price policy is introduced in chapter 2 and shown to be optimal for the most commonly used demand models. We also provide a steady-state solution for the joint pricing and inventory control model in subsection 2.3.3, which has not appeared in the literature so far.
Chapter 3 is devoted to marketing models that mainly focus on price optimization. The concept of reference price effects is introduced and structural properties of the optimal solutions are given for loss-neutral and loss-averse customer behavior. We show by a numerical example that for loss-seeking customer behavior, the optimal solution does not converge and thus a cycling pricing policy is optimal. As in chapter 2, we provide a steady-state solution for the case of non-zero proportional ordering costs, which is an extension to the solution found by Popescu and Wu (2007).
In chapter 4 we combine the two models presented in chapter 2 and chapter 3 and introduce an integrated model including reference price effects, which will lay the foundation for the rest of this work.
Chapter 5 is dedicated to an analysis of the model introduced in chapter 4. This chapter consists of three parts: the one-period case, the two-period case and the multi-period case. For the one-period case in section 5.1, we can prove the optimality of a base-stock-list-price policy and provide implicit solutions for the optimal price and stocking quantity with respect to the reference price under very general conditions. However, it is not so easy to extend this property to a multi-period setting. By integrating the solution of the one-period case into section 5.2, we find that for linear demand a base-stock-list-price policy also holds for the two-period case. The mathematics behind this result is extensive and tedious, which is why we chose to present the purely technical results in appendix A. In section 5.3 we prove the optimality of a base-stock policy under rather restrictive assumptions. Adjusting the proof technique for a more general setting is worth considering for further research.
Chapter 6 is devoted to simulations and numerical investigations. By means of numerical optimization, in section 6.1 we extend the results from section 5.2 to the multi-period setting for the special case of linear demand and loss-neutral customer behavior. We furthermore investigate the influence of different demand distributions and coefficients of variation. In section 6.2, we study the potential increase of profit from simultaneously determining optimal prices and stocking quantities compared to a sequential optimization, where prices are set first by the marketing department of a company and then the production unit decides on the optimal stocking quantity, without being able to change prices. In section 6.3, we provide some numerical results for loss-averse and loss-seeking customer behavior and the case of non-zero fixed ordering costs.
The last and concluding chapter 7 of this thesis provides an overview of conclusions and recommendations for further research.

Models in Operations Research Literature
Operations research has had a significant impact on inventory management in recent decades. The theory of inventory management deals with the management of stock levels of goods, with the intent of effectively meeting demands for those goods. Traditional inventory models (see sections 2.2.1 and 2.3.2) assume that a commodity's price is exogenously determined and thus only address the two fundamental issues: when should a replenishment order be placed, and what quantity should be ordered. Hence the objective is to minimize costs. Recent developments in the area of revenue management have demonstrated that major benefits can be derived by complementing a replenishment strategy with the dynamic adjustment of the commodity's price (see sections 2.2.2 and 2.3.3). Since demand for a product varies as a function of price in practice (see e.g. Phillips (2005)), the objective changes from minimizing costs to maximizing profits under dynamic pricing strategies. In the presence of demand uncertainty, a common approach for risk-neutral companies is to minimize expected costs or maximize expected profits. Alternative risk-averse approaches using e.g. Value at Risk measures instead of expected values can be found in the literature, but are not the focus of this thesis. The complexity of the model depends on the assumptions one makes about demand and the underlying cost structure.
According to Porteus (1990) and Lee and Nahmias (1993), there are several reasons for holding inventories. The key motive is to hedge against uncertainty in the face of stochastic demand. Holding stocks in response to this unpredictable variability means higher holding costs but lower shortage costs, which are in general significantly higher than holding costs. Moreover, economies of scale are an important reason for keeping inventories. Economies of scale occur when there is a fixed setup cost (e.g. setup time, changeover time, etc.) for each order that does not depend on the lot size, and often arise when there are quantity discounts or learning. Last but not least, it may be advantageous to retain inventories in anticipation of a price rise. Inventories may also be stockpiled in advance of sales increases. If demand is expected to rise, it may be more economical to build up large inventories in advance, rather than to increase production capacity at a future time. However, large build-ups of inventory are often a result of poor sales.

Problem description
Consider a retailer or manufacturer who maintains an inventory of a particular product. Since customer demand is random, the decision maker only has a vague idea about the actual demand occurring at a given time. This information is described by a probability distribution of demand. Depending on this knowledge, the retailer or manufacturer has to decide at what point to reorder or produce a new batch of products and subsequently how many items of the product the batch should comprise. Typically, such reordering decisions involve two different kinds of costs: a fixed amount, independent of the size of the order (e.g. the cost of sending a vehicle from the warehouse to the retailer), and a variable amount proportional to the number of products ordered. In the face of uncertainty about the actual demand, this decision will generally lead to over- or underproduction, with resultant excess inventories incurring unnecessary holding costs (typically accruing at a constant rate per unit of product per unit of time), or inability to meet consumer needs, respectively.
The literature shows two ways of coping with unmet consumer demands: either the lost sales case where demand that cannot be met immediately is lost forever, or the backlogging case, where demand for the product in excess of the amount stocked will be backlogged. This means that these customers will return next period for the product, in addition to the usual (random) number of customers who generate demand then. The inability to meet consumer needs when they occur results in potentially long term loss of customers for which artificial penalty costs called backlogging costs will be charged. The decision maker has to determine an optimal inventory policy to minimize the expected cost of ordering and holding inventory. In some situations, especially the one of interest in our work, the price at which the product is sold to the customer is also a decision variable. In this case demand is not only random but is also affected by the selling price. The retailer's or manufacturer's objective is thus to find an inventory and pricing strategy maximizing expected profits over the length of the planning horizon.
In this work we mainly focus on the retailing environment, where inventory decisions represent ordering decisions. However, the same argumentation can be extended to the manufacturing setting, where inventory decisions become procurement decisions under a different cost structure.
From this concavity property it becomes clear that $y^*$ is the unique optimum when $x < y^*$. Moreover, concavity in $y$ ensures that it is optimal not to place an order ($y^* = x$) if $x \ge y^*$, because the expected profit is strictly decreasing for any $y > x \ge y^*$. In the literature, such a policy is often called a base-stock policy.
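Definition 2.1 (Base-stock policy). A base-stock (order-up-to) policy is characterized by an order-up-to level, often referred to as the base-stock level $S^*$. If the initial inventory level before ordering $x$ is below the base-stock level, an order is placed to raise the inventory level up to the base-stock level. Otherwise, no order is placed (figure 2.1 gives a graphical description of such a policy):
$$y^*(x) = \begin{cases} S^* & \text{if } x < S^*, \\ x & \text{if } x \ge S^*. \end{cases} \tag{2.2.7}$$
For later investigations, some slightly different notation will be convenient, which we thus introduce here. When uncertainty in demand is modeled additively as $D(\epsilon) = E[D(\epsilon)] + \epsilon$, with $\epsilon$ a zero-mean random variable following a cumulative distribution function $F(\cdot)$, the optimal base-stock level in the lost-sales case is given by
$$S^* = E[D(\epsilon)] + F^{-1}\left(\frac{p + s - c}{p + s - v}\right), \tag{2.2.9}$$
where $p$ denotes the unit selling price, $c$ the unit ordering costs with $0 < c < p$, $s \ge 0$ the unit shortage costs, and $v < c$ the unit salvage value.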
Remark 2.1. $C_u = (p + s - c)$ represents the opportunity cost of underestimating demand and $C_o = (c - v)$ the cost of overestimating demand. The above ratio $C_u/(C_u + C_o)$ in equation (2.2.9) is known as the critical fractile. Intuitively, it corresponds to the safety factor at which the expected profit lost from being one unit short is equal to that from being one unit over.
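The critical fractile of Remark 2.1 can be illustrated numerically. The following is a minimal sketch, not the thesis's own implementation: it assumes additive demand with normally distributed noise (the normal distribution, the function names and all numbers are illustrative assumptions) and combines the fractile $C_u/(C_u + C_o)$ with the order-up-to rule of Definition 2.1.

```python
from statistics import NormalDist


def critical_fractile(p: float, c: float, s: float, v: float) -> float:
    """Return C_u / (C_u + C_o) for the lost-sales newsvendor."""
    c_under = p + s - c   # C_u: margin plus shortage penalty lost per unit short
    c_over = c - v        # C_o: ordering cost not recovered per unit left over
    return c_under / (c_under + c_over)


def base_stock_level(mean_demand: float, sigma: float,
                     p: float, c: float, s: float, v: float) -> float:
    """S* = E[D] + F^{-1}(fractile), ASSUMING zero-mean normal noise."""
    noise = NormalDist(mu=0.0, sigma=sigma)  # assumed distribution of epsilon
    return mean_demand + noise.inv_cdf(critical_fractile(p, c, s, v))


def order_up_to(x: float, s_star: float) -> float:
    """Inventory level after ordering under a base-stock policy."""
    return max(x, s_star)


# Illustrative numbers only (not taken from the thesis).
fractile = critical_fractile(p=10.0, c=6.0, s=1.0, v=2.0)
s_star = base_stock_level(100.0, 20.0, p=10.0, c=6.0, s=1.0, v=2.0)
print(f"critical fractile = {fractile:.4f}")
print(f"base-stock level  = {s_star:.2f}")
print(order_up_to(80.0, s_star), order_up_to(150.0, s_star))
```

With these numbers the fractile is 5/9, so the base-stock level lies slightly above mean demand; a starting inventory of 80 is raised to the base-stock level, while a starting inventory of 150 triggers no order.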
Logistics employs a different model if demand in excess of the amount stocked is backlogged. In this case, customers will return after the end of the period, when there is one more chance to place an order for the outstanding items, which are then instantaneously delivered to the customers (see e.g. Porteus (1990) for newsvendor models with partial backlogging or Khouja (1996) for newsvendor models with an emergency supply option). At the same time, backlogging costs $b \ge 0$ are charged as penalty costs for the inability to meet consumer needs when they occur. In case of overproduction with resultant excess inventories at the end of the period, holding costs $h \ge 0$ occur. These could be interpreted as carrying charges until the remaining items can be sold, e.g. to a discount store for some salvage value $v$. Holding and backlogging costs are charged for the period when they occur, whereas any financial flow after the end of the period (reordering/salvaging opportunity) is discounted by a discount factor $0 < \gamma \le 1$. A brief summary of the newly introduced variables is given in table 2.1. To ensure that it is not optimal to order nothing at all and merely accumulate backlog penalty costs, $b > (1 - \gamma)c$ is also assumed. Note that since holding costs and salvage value always occur together ($h - \gamma v$) and backlogging costs are always associated with ordering costs after the end of the period ($b + \gamma c$), each pair could be integrated into a single variable, which is usually the case in the literature. However, we here choose to keep them separate in preparation for the multi-period inventory model. Maximizing expected profits yields the following theorem.

Theorem 2.2. Let demand $D(\epsilon)$ again be modeled additively, such that $D(\epsilon) = E[D(\epsilon)] + \epsilon$, with $E[D(\epsilon)]$ being the mean demand and $\epsilon$ a random variable with mean zero following a cumulative distribution function $F(\cdot)$. Then for the backlogging case an order-up-to policy with the optimal base-stock level $S^*$ is also given by
$$S^* = E[D(\epsilon)] + F^{-1}\left(\frac{b - (1 - \gamma)c}{h + b - \gamma(v - c)}\right), \tag{2.2.10}$$
where $p$ denotes the unit selling price, $c$ the unit ordering costs with $0 < c < p$, $\gamma$ the discount factor with $0 < \gamma \le 1$, $b > (1 - \gamma)c$ the unit backlogging costs, $h \ge 0$ the unit holding costs and $v < c$ the unit salvage value.
Proof. Let $u$ denote the realization of the random variable $\epsilon$ and differentiate the expected one-period profit (2.2.11) with respect to the inventory $y$. Setting the derivative equal to zero and solving for $y$ leads to the base-stock level stated in the theorem.
Remark 2.2. Note that the above result can only be obtained for backlogging costs $b$ being independent of price $p$.
Remark 2.3. The critical fractile $(b - (1 - \gamma)c)/(h + b - \gamma(v - c))$ has a similar interpretation as in equation (2.2.9): it corresponds to the order quantity at which the expected profit lost from being one unit short is equal to that from being one unit over. Here $C_u = (b - (1 - \gamma)c)$ denotes the opportunity cost of underestimating demand and $C_o = (h + c - \gamma v)$ the cost of overestimating demand. The above ratio is again given by $C_u/(C_u + C_o)$. Note that in the backlogging case $p$ does not appear in the critical fractile. This is because the items are sold in any case. Today, the situation characterized by formula (2.2.10) is more prevalent in practice, since due to competition firms are more willing to incur substantial backlogging costs than to lose customers in the future due to unsatisfied demand.
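As a hedged numerical illustration of Remark 2.3, the sketch below evaluates the backlogging fractile from $C_u = b - (1 - \gamma)c$ and $C_o = h + c - \gamma v$ and checks it against the closed-form ratio. The normal noise distribution, the function names and the parameter values are assumptions made for this example only.

```python
from statistics import NormalDist


def backlog_fractile(c: float, b: float, h: float, v: float, gamma: float) -> float:
    """Return C_u / (C_u + C_o) for the one-period backlogging model."""
    c_under = b - (1.0 - gamma) * c   # C_u: penalty for being one unit short
    c_over = h + c - gamma * v        # C_o: cost of carrying one unit too many
    if c_under <= 0.0:
        # Mirrors the assumption b > (1 - gamma) * c made in the text.
        raise ValueError("requires b > (1 - gamma) * c")
    return c_under / (c_under + c_over)


def backlog_base_stock(mean_demand: float, sigma: float, c: float, b: float,
                       h: float, v: float, gamma: float) -> float:
    """S* = E[D] + F^{-1}(fractile), ASSUMING zero-mean normal noise."""
    noise = NormalDist(mu=0.0, sigma=sigma)
    return mean_demand + noise.inv_cdf(backlog_fractile(c, b, h, v, gamma))


# Illustrative numbers only; note that the selling price p is not an input.
fractile = backlog_fractile(c=6.0, b=3.0, h=1.0, v=2.0, gamma=0.95)
closed_form = (3.0 - (1.0 - 0.95) * 6.0) / (1.0 + 3.0 - 0.95 * (2.0 - 6.0))
s_star = backlog_base_stock(100.0, 20.0, c=6.0, b=3.0, h=1.0, v=2.0, gamma=0.95)
print(f"fractile = {fractile:.4f}, closed form = {closed_form:.4f}")
print(f"base-stock level = {s_star:.2f}")
```

Because the fractile here is below one half, the base-stock level falls below mean demand: with these costs, carrying an extra unit is more expensive than backlogging one.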

Joint pricing and inventory control
We now apply the newsvendor problem to analyze firms that jointly set a selling price and a stocking quantity prior to facing the random demand in a single period. Such an extended model incorporates the price as a decision variable, which provides an excellent vehicle for examining how operational problems interact with marketing issues to influence decision making at the corporate level. One of the first attempts to address joint marketing-production decision making was presented by Whitin (1955), who formulated a newsvendor model with price effects. He adapted the model described in section 2.2.1 in such a way that the probability distribution of demand depends on the selling price, where price is a decision variable rather than an external parameter, and found that a gain can be achieved by more closely coordinating marketing and logistics. A good survey on price setting newsvendor models can be found in Petruzzi and Dada (1999).
Petruzzi and Dada (1999) show by simple differentiation (as we did in section 2.2.1) that $E[\Pi(z, p, \epsilon)]$ is concave in $p$ for a given $z$, which guarantees that the sequential process of first optimizing $p$ for a given $z$ and then searching over the resulting optimal trajectory to maximize $E[\Pi(z, p(z), \epsilon)]$ in the safety stock $z$ yields the optimal solution. In this case the optimal price for the integrated problem is given by (see Petruzzi and Dada (1999) for details)
$$p^*(z) = p^0 + \frac{\Theta(z)}{2\beta_1}, \tag{2.2.15}$$
where $p^0$ denotes the optimal riskless price (2.2.16), which is obtained by differentiating the marketing goal function $(p - c)E[D(p, \epsilon)]$ with respect to $p$ and setting the result equal to zero (Phillips (2005), Chapter 1). Furthermore, $\Theta(z)$ denotes the expected lost sales when a safety stock $z$ is chosen.
Since $\beta_1 < 0$ and $\Theta(z)$ is nonnegative, the following theorem holds:
Theorem 2.3. In the lost sales case the optimal riskless price $p^0$ is higher than the optimal price $p^*$ incorporating risk.
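Theorem 2.3 can be checked numerically under explicit assumptions. The sketch below assumes linear additive demand $D(p, \epsilon) = \beta_0 + \beta_1 p + \epsilon$ with $\beta_1 < 0$ and normally distributed noise; the intercept $\beta_0$, the normal distribution, the function names and all numbers are hypothetical choices for illustration. It uses the relation $p^*(z) = p^0 + \Theta(z)/(2\beta_1)$ of Petruzzi and Dada (1999), with $\Theta(z) = E[(\epsilon - z)^+]$ evaluated in closed form for the normal case.

```python
from statistics import NormalDist


def riskless_price(beta0: float, beta1: float, c: float) -> float:
    """Maximizer of (p - c) * (beta0 + beta1 * p) for beta1 < 0."""
    return (beta1 * c - beta0) / (2.0 * beta1)


def expected_lost_sales(z: float, sigma: float) -> float:
    """Theta(z) = E[(eps - z)^+] for eps ~ N(0, sigma^2) (assumed noise)."""
    std = NormalDist()  # standard normal
    k = z / sigma
    return sigma * (std.pdf(k) - k * (1.0 - std.cdf(k)))


def risk_adjusted_price(z: float, beta0: float, beta1: float,
                        c: float, sigma: float) -> float:
    """p*(z) = p0 + Theta(z) / (2 * beta1); lies below p0 since beta1 < 0."""
    return riskless_price(beta0, beta1, c) + expected_lost_sales(z, sigma) / (2.0 * beta1)


# Hypothetical parameters: demand 200 - 10 p + eps, unit cost 6, sigma 20.
p0 = riskless_price(beta0=200.0, beta1=-10.0, c=6.0)
for z in (-20.0, 0.0, 20.0, 60.0):
    p_star = risk_adjusted_price(z, 200.0, -10.0, 6.0, 20.0)
    print(f"z = {z:6.1f}  p0 = {p0:.3f}  p*(z) = {p_star:.3f}")
```

In this example the gap between $p^0$ and $p^*(z)$ shrinks as the safety stock $z$ grows, because the expected lost sales $\Theta(z)$ tend to zero.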
Remark 2.4. In the integrated setting the price is used to reduce the coefficient of variation of demand, and the difference between the optimal price set by marketing in isolation and the jointly optimized price is decreasing with increased price sensitivity (slope of the demand function $\beta_1$) and demand uncertainty. However, as Petruzzi and Dada (1999)
Theorem 2.4. In the backlogging case, the optimal price $p^*$ for the integrated problem is the same as the optimal price $p^0$ obtained by the sequential method: $p^* = p^0$.
Remark 2.5. In the backlogging case, we can set the price independently of the inventory decision. Thus, while the sequential approach is not optimal in the lost-sales case, no gain is achieved by joint optimization in the backlogging case, which is, as already stated above, more prevalent in practice.

Multi-period models
We are now ready to consider the finite-horizon multi-period version of the problem setting described in the last section, which was first introduced and solved by Arrow et al. (1951). The backlogging version of the system described in section 2.2 will now be operated over $T$ periods. What makes the problem more complicated than solving $T$ copies of the single-period problem is that any leftover stock at the end of one period is retained and can be offered for sale in the following period (see figure 2.2 for a sample inventory path). The inventory level $x_t$ is reviewed at regular intervals (e.g. each week or month), an appropriate quantity $y_t - x_t$ is ordered and a per-unit selling price $p_t$ is charged after each review at the beginning of a new period $t$. For easier tractability and clarity of the formulas we assume that all input variables are stationary and thus not anticipated to change over time (most of the results presented in the following also hold in the time-variant case). Each unit of positive leftover stock at the end of a period incurs holding costs $h$. If demand exceeds the inventory on hand, the additional demand is backlogged and is filled when additional inventory becomes available; the backlogged units are viewed as negative inventory. This means that these customers will return in the next period for the product, in addition to the usual random number of customers. A per-unit backlogging cost $b$, with $b > (1-\gamma)c$, is charged as a penalty cost. The newly arising demands $D_t$ in different periods are assumed to be statistically independent and identically distributed according to general stochastic demand functions as in the above section. The ordering cost function includes both a per-unit variable cost $c$ and a fixed setup cost $k$, which is incurred if an order is placed ($y_t > x_t$), regardless of its size. If no order is placed, no setup costs are incurred.
Orders placed are essentially received immediately (in time to meet the demand that arises in that period). All costs are expressed in beginning-of-period cash units; cash flows occurring in subsequent time periods are discounted by a one-period discount factor $\gamma \in (0, 1]$. After the last period, the remaining inventory is salvaged at a per-unit salvage value $v$, or backlogged demand is satisfied, for which a final order is placed. For an overview of the variables introduced in this section we refer the reader to table 2.2 at the end of this section. The objective of the dynamic version of the backlogging inventory model is to maximize the total expected discounted profit $V(x_1)$, when the initial inventory on stock before ordering at the beginning of the planning horizon is $x_1$: [...]

If the demand distribution functions are discretized, according to Jung et al. (2004) the evolution of demands over time can be represented by a tree-like structure (see figure 2.3). Starting from each node, there can be several possible demand realizations, expressed as branches stemming from that node. Assuming $m$ possible next-period demand realizations at each node, the total number of scenarios amounts to $m^T$. In each period $t$ each node is associated with the realization of demand, the decision variables and the state variables.
Complete enumeration would amount to an exponential complexity of $O(m^T)$, where $O(\cdot)$ denotes the Big O notation, which describes the runtime complexity of an algorithm. Therefore a stochastic dynamic programming approach with the significantly lower complexity of $O(Tm)$ is described in the following to model the planning process as it reacts to demand realizations unfolding over time. Dynamic programs deal with situations where decisions are made in stages. In a stochastic setting dealing with random parameters, the outcome of each decision is not fully predictable but can be anticipated to some extent before the next decision is made. The objective in dynamic programs is either to minimize or to maximize a certain value, which can be expected costs or profits, respectively. A key aspect of such environments is that decisions cannot be viewed in isolation, since one must balance the desire for high (respectively low) present values against the undesirability of low (respectively high) future values. The technique of dynamic programming captures this tradeoff. At each stage it ranks decisions based on the sum of the present values and the expected future values, assuming optimal decision making for subsequent stages. The states of the system summarize past information that is relevant for future optimization.
The principle of dynamic programming, popularized by Richard Bellman in the 1950s, is to decompose such a complicated problem into a sequence of equivalent single-period problems. One need only specify the optimal value of starting the next period (as a function of the starting state) and continuing over the remainder of the planning horizon as the 'salvage' value function. In the case of dynamic models, this usually amounts to working backwards. A good review of how stochastic dynamic programming models, also referred to as Markov decision processes or stochastic control problems, are applied in the economic literature can be found in e.g. Stokey et al. (1989), Puterman (1994), Porteus (2002), Miranda and Fackler (2002), Heyman and Sobel (2004) and Bertsekas (2005).
Dynamic programming using backward recursion is an appropriate technique for solving the above multi-period maximization problem. Thus, equation ( [...] where the value function $V_t(x)$ denotes the maximum expected discounted profit for periods $t, \ldots, T$ (profit-to-go function) when starting period $t$ with initial inventory level $x_t$, and [...] Equation (2.3.6) gives the gross quantity of stock on hand at the beginning of period $t$, which equals the inventory on hand after ordering at the beginning of the previous time period less the total quantity actually sold during that period (see figure 2.2). A brief idea of the system dynamics is given in figure 2.4.
In the study of stochastic dynamic programming models, researchers often attempt to establish certain structural properties of the value function in the state variables, such as monotonicity, convexity or supermodularity. Properties such as convexity can be enough to specify the general form of the optimal policy. Establishing the existence of optimal policies with a special structure is of great practical importance, since such policies are highly appealing to decision makers, are easy to implement and enable efficient computation. In such cases specialized algorithms can be developed to search only among policies that have the same form as the optimal policy, which reduces computation time significantly. When the optimality of e.g. monotone decision rules is known, efficient backward induction algorithms (see Puterman (1994), section 4.7.6) can be developed by successively restricting the action space. Furthermore, such properties can help in developing a qualitative understanding of the model by describing how the results change with changes in the model parameters. For a good review and general results on structural properties of stochastic dynamic programs we refer the reader to e.g. Smith and McCardle (2002), Puterman (1994), Topkis (1998), Bertsekas (2005), Bertsekas (2001) and Heyman and Sobel (2004).
The following two subsections are devoted to a review of some simple forms of optimal policies that have already been found in the literature and also provide some intuitive understanding of the structural results, which will be useful for a better understanding of the integrated model in chapters 5 and 6. Furthermore, we include a brief convergence analysis. Bellman and Glicksberg (1955) were the first to show that the optimal total cost function is convex in the inventory on stock before ordering under certain stationarity assumptions, which means that a constant stock level is optimal (often referred to as base-stock level, see definition 2.1 on page 21). Wagner and Whitin (1958) presented a forward algorithm for the solution of the dynamic version of the economic lot size model. In the following we give the reader an idea of how structural properties are maintained by induction from one time period to the next and so lead to a base-stock policy. Since in this section we focus solely on optimizing the inventory level $y_t$ in each time period, we let demand be exogenously given by $D_t =$ [...]
To prove that a base-stock policy is optimal, we establish the following lemma, which can be found on page 525 in Heyman and Sobel (2004): [...] and $g(x, y) < \infty$ for every $x \in X$, then $f$ is a concave function on $X$.
Theorem 2.5. Let a multi-period inventory control model be given by the dynamic program defined in equation (2.3.7), whereby $v = c$ is assumed and thus $V_{T+1}(x) = cx$. Then the following holds for any time period $t = 1, \ldots, T$:
1. $J_t(x, y)$ is jointly concave in $x$ and $y$.
2. $V_t(x)$ is concave in $x$.
3. A base-stock policy with order-up-to level $S_t^*$ is optimal in time period $t$.
Proof. The proof follows the principle of induction. By applying Leibniz's integration rule we find $-G(y)$ to be concave in $y$: [...] (2.3.11)

Since $-G(y)$ does not depend on $x$, $-G(y)$ is also jointly concave in $x$ and $y$. Moreover, $V_{T+1}(x)$ [...] $J_T(x, y)$ being jointly concave furthermore yields the optimality of a base-stock policy (see [...]). By the same argumentation as used above for $t = T$, $J_t(x, y)$ is then shown to be jointly concave in $x$ and $y$, which yields that $V_t(x)$ is concave in $x$. Thus we have shown that a base-stock policy is optimal for any time period $t$.
∎

Remark 2.6. A terminal value $V_{T+1}(x) = cx$ means that leftover units at the end of the planning horizon can be salvaged at the same cost at which they were originally bought. This assumption is common in the literature since it guarantees analytical tractability. Figure 2.5 shows that for the finite-horizon case with no salvage value ($v = 0$), the optimal base-stock level decreases over time. The reason is that towards the end of the planning horizon the remaining time gets shorter and the risk of not selling the inventory on stock increases; the decision maker hedges against the resulting costs by a diminishing base-stock level.
Of course there is no such risk in the case where the per-unit salvage value equals the per-unit ordering cost ($v = c$). In this case, inventory on stock that is not sold by the end of the planning horizon can be salvaged for the same amount of money for which it was ordered.
Here, a myopic policy, which looks only at the single-period backlogging problem described in subsection 2.2.1, is optimal in every period, regardless of the time horizon $T$ (see Veinott (1965)). [...] where $x_1$ denotes the starting inventory level at the beginning of the time horizon. We now [...] for all $t \ge 1$ and then rearrange the sum in such a way that we are left only with terms indexed by $t+1$ in the $t$-th summand: [...] (2.3.13)

As we assume that we are in a steady state, we replace the time-dependent $y_t$ by the time-invariant $y$. Since the profit function $V(x_1)$ is concave in $y$, the optimal inventory level after ordering $y^*$ can now be obtained by differentiating $V(x_1)$ with respect to $y$ and setting the result equal to zero. Using (2.3.10) we thus obtain [...], which results in equation (2.3.12).

Remark 2.8. The steady-state base-stock level $S^*_\infty$ is increasing in both the discount factor $\gamma$ and the backlogging cost $b$, since $F^{-1}(\cdot)$ is increasing and its argument increases in these parameters. This is intuitive, because the seller, by keeping higher inventory levels, wants to hedge against higher backlogging costs or against reordering costs in a subsequent time period, respectively. Furthermore, a higher demand uncertainty also results in higher safety stock levels and hence in higher base-stock levels. Numerical results also show that for more heavy-tailed demand distributions (like the beta or the log-normal distribution) base-stock levels are higher to prepare for the higher risk of large demands.

Scarf (1960) and Veinott (1966) later extend the above theory to the case of nonzero fixed ordering costs. They prove that the optimal total cost function is $k$-convex (under the assumption of convex holding/shortage costs), which implies that the optimal policy in each period is an $(s, S)$-type policy: if the inventory level at the beginning of period $t$ is below the reorder point $s_t$, an order is placed to raise the inventory level to the order-up-to level $S_t$. Otherwise no order is placed.
Since we do not focus on the case of nonzero fixed ordering costs in this thesis, we omit further details here. For the sake of completeness we only mention that Zheng (1991) gives a simple proof of the optimality of an $(s, S)$-policy for the infinite-horizon case, which, in contrast to an earlier proof by Iglehart (1963), does not depend on results for the finite-horizon problem.
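The base-stock results of this subsection can be illustrated by a small backward recursion. The sketch below is written under stated assumptions and is not the thesis's own implementation: it treats the price as fixed, so that revenue is exogenous and the problem can be written as cost minimization; demand is an integer-valued discrete random variable; there is no fixed ordering cost ($k = 0$); and the terminal condition salvages leftover stock at $v$ and clears backlog with a final order at cost $c$. All function names and parameter values are hypothetical.

```python
from functools import lru_cache


def base_stock_levels(T, c, h, b, gamma, v, demand, y_max=12):
    """Return the optimal order-up-to level for each period t = 1, ..., T.

    `demand` maps each non-negative integer demand value to its probability.
    """
    def holding_backlog_cost(y):
        # Expected one-period holding and backlogging cost for inventory y after ordering.
        return sum(q * (h * max(y - d, 0) + b * max(d - y, 0)) for d, q in demand.items())

    @lru_cache(maxsize=None)
    def cost_to_go(t, x):
        # Minimal expected discounted cost from period t on, starting with inventory x.
        if t == T + 1:
            # Leftover stock is salvaged at v; backlog is cleared by a final order at cost c.
            return -v * x if x >= 0 else -c * x
        return min(after_order_cost(t, y) for y in range(x, y_max + 1)) - c * x

    @lru_cache(maxsize=None)
    def after_order_cost(t, y):
        # Cost of holding y units after ordering in period t (the term -c*x is added above).
        future = sum(q * cost_to_go(t + 1, y - d) for d, q in demand.items())
        return c * y + holding_backlog_cost(y) + gamma * future

    return [min(range(y_max + 1), key=lambda y: after_order_cost(t, y))
            for t in range(1, T + 1)]


def critical_fractile_level(c, h, b, gamma, demand):
    """Smallest y with F(y) >= (b - (1 - gamma) * c) / (h + b), the myopic level."""
    target = (b - (1 - gamma) * c) / (h + b)
    cumulative = 0.0
    for d in sorted(demand):
        cumulative += demand[d]
        if cumulative >= target:
            return d
    return max(demand)


if __name__ == "__main__":
    uniform_demand = {d: 0.2 for d in range(5)}  # hypothetical demand on {0, ..., 4}
    print(base_stock_levels(5, 1.0, 0.5, 3.0, 0.95, 1.0, uniform_demand))  # v = c
    print(base_stock_levels(5, 1.0, 0.5, 3.0, 0.95, 0.0, uniform_demand))  # v = 0
```

With $v = c$ the computed levels coincide in every period with the myopic critical-fractile level, while with $v = 0$ the level drops in the last period, which mirrors the behavior described in remark 2.6.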

Joint pricing and inventory control
One of the first attempts to address joint marketing-production decision making was presented by Whitin (1955), who used a deterministic model for the multi-period case. Thomas (1970) then extended the well-known Wagner and Whitin (1958) forward algorithm to the marketing-production domain, where price is included as a decision variable (still in a deterministic setting). In a subsequent paper, Thomas (1974) considers a stochastic version of his model. There he considers the problem of jointly setting price and production levels in a series of $T$ periods, where price is modeled as a parameter in the probability distribution of demand. He is the first to formulate the problem as a dynamic program, from which an optimal policy is derived numerically. Following the work of Porteus (1982) and Gallego and Van Ryzin (1994), Federgruen and Heching (1999) prove that a base-stock-list-price policy is optimal, assuming that the underlying demand function is linear and that the ordering cost is proportional to the amount ordered and thus does not include a fixed cost component. That is, in each period the optimal policy is characterized by an order-up-to level, referred to as the base-stock level, and a price which depends on the initial inventory level at the beginning of the period. If the initial inventory level is below the base-stock level, an order is placed to raise the inventory level to the base-stock level and the ordinary price (the list price) is charged. Otherwise, no order is placed and a discount price is offered, which is a non-increasing function of the initial inventory.
[...] For tractability we consider the special case of zero fixed ordering costs ($k = 0$). We thus define the underlying dynamic program as follows: [...] (2.3.15), whereby $v = c$ is assumed and thus $V_{T+1}(x) = cx$. Then the following holds for any time period $t = 1, \ldots, T$:
1. $J_t(x, y, p)$ is jointly concave in $x$, $y$ and $p$.
2. $V_t(x)$ is concave in $x$.
3. A base-stock policy with order-up-to level $S_t^*$ is optimal in time period $t$.
Proof. The proof technique is identical to the one used for theorem 2.5, except that this time joint concavity in the two decision variables $y$ and $p$ is used to show the optimality of a base-stock policy. When comparing theorem 2.7 to theorem 1 in Federgruen and Heching (1999), note that there the optimal profit is reduced by the proportional costs of the stock on hand, $cx$, and thus concavity in $x$ is trivially given, since then the value function before maximization $J(y, p)$ no longer depends on $x$.
∎

We now turn our attention to the list-price property, which we define in the following:

Definition 2.2 (Base-stock-list-price policy). A base-stock-list-price policy is closely related to a base-stock policy (see definition 2.1). If the initial inventory level $x$ is below the base-stock level, an order is placed to raise the inventory level to the base-stock level and a so-called list price $P^*$ is charged. Otherwise, no order is placed and a discount price $p^*(x)$ is offered, which is a non-increasing function of the initial inventory (see figures 2.1 and 2.7 for a graphical description of such a policy), where $p^*(x) \le P^*$ and $p^*(x)$ is non-increasing in $x$.
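The decision rule in definition 2.2 can be written down directly. The following minimal sketch only encodes the form of the policy; the base-stock level, the list price and the markdown schedule are hypothetical inputs, and the linear markdown is merely one example of a non-increasing discount price.

```python
def base_stock_list_price(x, base_stock, list_price, discount_price):
    """Return (inventory after ordering, price) for initial inventory x.

    `discount_price` is a hypothetical callable x -> price that should be
    non-increasing in x; its value is capped at the list price.
    """
    if x < base_stock:
        # Order up to the base-stock level and charge the list price.
        return base_stock, list_price
    # No order is placed; a discount price is offered instead.
    return x, min(list_price, discount_price(x))


def linear_markdown(base_stock, list_price, slope):
    """Example schedule: the price falls linearly with the overstock x - base_stock."""
    return lambda x: list_price - slope * max(x - base_stock, 0)


if __name__ == "__main__":
    schedule = linear_markdown(base_stock=10, list_price=5.0, slope=0.1)
    for x in (4, 10, 15, 20):
        print(x, base_stock_list_price(x, 10, 5.0, schedule))
```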

Definition 2.3 (Submodularity). A submodular/subadditive function is a function $f(x, y)$ that has monotone decreasing differences, which means that for all $x^+ \ge x^- \in X$ and $y^+ \ge y^- \in Y$: $f(x^+, y^+) - f(x^-, y^+) \le f(x^+, y^-) - f(x^-, y^-)$. [...]

Proof. To prove this theorem it suffices by theorem 8-4 in Heyman and Sobel (2004) to show that $J_t(x, y, p)$ is submodular in $y$ and $p$ (see definition 2.3) for any time period $t = 1, \ldots, T$.
Since the sum of submodular functions is submodular, we need to show submodularity of each of the terms in (2.3.16). The first and second terms are trivially submodular, since they depend only on one of the two variables $y$ and $p$. In order to show that $-G(y, p)$ has monotone decreasing differences, we define $H(x)$ to be the actual holding/backlogging costs in [...] and $p$ we consider an arbitrary pair of inventory levels $(y^-, y^+)$ and any pair of price levels $(p^-, p^+)$ with $y^- < y^+$ and $p^- < p^+$. We furthermore define [...] From $\beta_1 < 0$ it follows naturally that $x^{-+} > x^{--}$. By convexity of $H(\cdot)$ we have: [...] (2.3.21)

Thus by definition 2.3, $H(y - D(p, \epsilon_t))$ is supermodular in $y$ and $p$.
Since $-H(y - D(p, \epsilon_t))$ is then submodular and taking expected values preserves the submodularity property, it is clear that $-G(y, p)$ is submodular. Finally, the submodularity proof for the last term in (2.3.16) is identical to the one for $-G(y, p)$, as $V_{t+1}(x)$ is concave in $x$ by theorem 2.7.
∎

Remark 2.9. The optimality of a list-price policy can be motivated by the intuition that holding costs of unnecessarily high inventory levels $x$ can be reduced by accelerating demand via reducing the selling price $p$.
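The difference property used in the proof can be checked numerically on a grid. The sketch below assumes piecewise linear holding/backlogging costs $H(x) = h \max(x, 0) + b \max(-x, 0)$ and additive linear demand with a few equally likely shocks; these choices, the parameter values and the function names are hypothetical and serve only to illustrate the argument that a convex $H$ composed with $y - D(p, \epsilon)$ has increasing differences in $(y, p)$ when $\beta_1 < 0$, so that its negative has decreasing differences.

```python
import itertools

EPS = [-2.0, 0.0, 2.0]  # hypothetical, equally likely demand shocks


def holding_backlog_cost(x, h=1.0, b=3.0):
    """Convex cost H(x) of ending a period with net inventory x."""
    return h * max(x, 0.0) + b * max(-x, 0.0)


def expected_cost(y, p, beta0=20.0, beta1=-2.0):
    """G(y, p) = E[H(y - D(p, eps))] for demand D = beta0 + beta1 * p + eps."""
    return sum(holding_backlog_cost(y - (beta0 + beta1 * p + e)) for e in EPS) / len(EPS)


def has_increasing_differences(f, ys, ps, tol=1e-9):
    """Check f(y+, p+) - f(y-, p+) >= f(y+, p-) - f(y-, p-) on all grid pairs."""
    for y_lo, y_hi in itertools.combinations(sorted(ys), 2):
        for p_lo, p_hi in itertools.combinations(sorted(ps), 2):
            if f(y_hi, p_hi) - f(y_lo, p_hi) < f(y_hi, p_lo) - f(y_lo, p_lo) - tol:
                return False
    return True


def has_decreasing_differences(f, ys, ps, tol=1e-9):
    """A function has decreasing differences iff its negative has increasing ones."""
    return has_increasing_differences(lambda y, p: -f(y, p), ys, ps, tol)


if __name__ == "__main__":
    ys = [float(y) for y in range(0, 21, 2)]
    ps = [3.0 + 0.5 * i for i in range(11)]
    print("G supermodular on grid:", has_increasing_differences(expected_cost, ys, ps))
    print("-G submodular on grid:",
          has_decreasing_differences(lambda y, p: -expected_cost(y, p), ys, ps))
```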
Similar to theorem 2.6, we can find a possible steady state for the joint pricing and inventory control model (2.3.15).
Theorem 2.9. If the dynamic program (2.3.15) admits a steady state, then it is given by [...]

Proof. The optimal total discounted profit for an infinite time horizon is given by (compare equation (2.3.1)): [...] Since the profit function $V(x_1)$ is concave in $y$ (see theorem 2.6), the optimization problem in the two variables $p$ and $y$ can be reduced to an optimization problem in the single variable $p$. This can be done by first solving for the optimal value of $y$ as a function of $p$ and then substituting the result $y^*_\infty(p)$ back into $V(x_1)$ and solving in $p$. $y^*_\infty(p)$ can be obtained in the same manner as in theorem 2.6, with the only difference that now $p$ is also a decision variable. For convenience we define $y^0 = F^{-1}\left(\frac{b - (1-\gamma)c}{h + b}\right)$. We now use the optimal inventory after ordering $y^*_\infty(p) = E[D(p, \epsilon)] + y^0$ (compare theorem 2.6) as an input for $y$ in equation [...] which is a constant and no longer depends on $p$. [...]

Furthermore, it can be observed that at the beginning of the time horizon a base-stock level greater than the expected demand is kept in order to hedge against expensive backlogging costs. Towards the end of the time horizon, the risk of not selling inventories on stock increases, which motivates decreasing base-stock levels. Note that in the last time period a negative safety stock $S_T^* - E[D(p_T, \epsilon_T)] < 0$ is observed. Figure 2.9 shows the optimal price $p^*(x_t)$ as a function of the inventory before ordering $x_t$ for different time periods and the above parameter setting. It is easy to see that the optimal policy is a list-price policy in any time period $t$. Furthermore, it can be seen that the tendency to give price discounts at lower inventory levels before ordering increases over time. This is intuitive since, as described above, the model aims at reducing inventory levels towards the end of the time horizon, and a lower price for higher inventory levels results in a higher expected demand, which in turn results in lower inventory levels after demand is realized.
Chen and Simchi-Levi (2004a) later extend the above described results to more general demand functions and to the case of nonzero fixed ordering costs. They prove that the optimal profit function is $k$-concave (symmetric $k$-concave) and find that an $(s, S, p)$-policy is optimal. In such a policy the inventory is managed by the classical $(s, S)$-policy and the price is determined based on the inventory level at the beginning of each period. In a different paper, Chen and Simchi-Levi (2004b) investigate the infinite-horizon problem. Since we focus neither on the case of nonzero fixed ordering costs nor on the infinite horizon in this thesis, we omit further details on those contributions here. For the convenience of the reader we give a brief overview of the notation used in this section in table 2.2.

Table 2.2 (surviving entry): $V_t(x)$, expected optimal profit-to-go function.

Problem description
Research in marketing demonstrates that in markets with repeated interactions, demand depends not only on the current price but is also sensitive to the firm's pricing history, which gives rise to intertemporal demand correlations. The aim of these approaches is to assess optimal prices with respect to maximizing total expected profit, taking demand fulfillment for granted. Since consumers have a memory, the perception of a price is based not only on its absolute level, but also on its deviation from some reference level resulting from the pricing history.
As customers revisit the firm, they develop price expectations, which become a benchmark against which current prices are compared. A formulation that captures this effect is the so-called reference price, which is a standard price against which consumers evaluate the actual prices of products they are considering (see e.g. Winer (1986), Greenleaf (1995), Kopalle et al. (1996), Briesch et al. (1997), Fibich et al. (2003), Mazumdar et al. (2005), Natter et al. (2006)). If the price is below the reference price, the observed price is lower than anticipated, resulting in a perceived gain. This makes a purchase more attractive and raises demand. Similarly, the opposite situation results in a perceived loss, reducing the probability of a purchase (people are less likely to buy products after prices have gone up). An important consequence of this reference price formation is that although frequent price discounts may be beneficial in the short run, they may be dangerous in the long run when consumers get used to these discounts and reference prices drop. The reduced price becomes anticipated and loses its effectiveness, whereas the non-promoted price becomes unanticipated and is perceived as a loss. Thus, the optimal price policy becomes dynamic, with the reference price being the state variable. Popescu and Wu (2007), for example, show that if the reference level is initially high, an optimizing firm will often consistently price below this level, which has the effect of a skimming strategy. Similarly, a low initial reference level leads to the optimality of a penetration-type strategy.
The reference price effect can be integrated into the demand model by modeling the reference price as the weighted sum of the previous reference price and the previous price set (exponential smoothing) and by adding an additional term to the response function, where a positive reference gap (the current price is lower than the reference price) increases demand, while a negative reference gap decreases demand (see equation (3.2.1)).

The above described marketing model aims at setting optimal prices without considering inventory decisions by taking demand fulfillment for granted. As in chapter 2, we consider a finite-horizon, stochastic, single-item, periodic-review model under a monopolistic setting. Demand perturbations in consecutive periods are independent and their distribution depends on the item's price and the consumer's reference price, which is based on the pricing history. Including reference price effects, we redefine the additive demand function used in chapter 2 in the following way:

Definition 3.1. The stochastic demand is modeled by the piecewise linear function [...] where $r_t$ denotes the reference price in period $t$, $\epsilon_t$ is i.i.d. according to an arbitrary [...] buyers are loss-neutral and the demand function is smooth. For loss-averse consumers, the demand function is steeper for losses than for gains ($\beta_2 < \beta_3$) and consumers respond more to surcharges than to discounts. In other words, a loss decreases expected demand more than an equivalently sized gain would increase demand (see figure 3.1). This behavior is predicted by Prospect Theory (see e.g. Winer (1986)). If $\beta_2 > \beta_3$, consumers are loss-seeking. As Slonim and Garbarino (2002) show, $\beta_2 > \beta_3$ can also arise on the aggregate level when in fact the consumers behave according to Prospect Theory but stockpile when prices are low.
We will focus on the loss-neutral and loss-averse cases, which yield closed-form steady state solutions, while the optimal pricing policy in the loss-seeking case cycles (see Popescu and Wu (2007) and figure 3.8).
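The asymmetry between gains and losses can be made concrete with a small function. The functional form below is an assumption on our part: it is one piecewise linear reading that is consistent with the verbal description above (a coefficient $\beta_2$ on surcharges $p > r$ and a coefficient $\beta_3$ on discounts $p < r$, both non-positive), not a transcription of equation (3.2.1). The parameter values are hypothetical except for $\beta_0 = 100$ and $\beta_1 = -20$, which are taken from the numerical example later in this chapter.

```python
def expected_demand(p, r, beta0, beta1, beta2, beta3):
    """Expected demand with reference price effects (assumed functional form).

    beta2 weights perceived losses (p above r), beta3 weights perceived gains
    (p below r); both are non-positive, so a surcharge lowers and a discount
    raises demand relative to beta0 + beta1 * p.
    """
    loss = max(p - r, 0.0)  # surcharge relative to the reference price
    gain = min(p - r, 0.0)  # discount relative to the reference price (non-positive)
    return beta0 + beta1 * p + beta2 * loss + beta3 * gain


def classify(beta2, beta3):
    """Label customer behavior by comparing the loss and gain coefficients."""
    if beta2 < beta3:
        return "loss-averse"
    if beta2 > beta3:
        return "loss-seeking"
    return "loss-neutral"


if __name__ == "__main__":
    params = dict(beta0=100.0, beta1=-20.0, beta2=-40.0, beta3=-20.0)
    # Demand drop caused by a surcharge of 0.2 versus demand rise caused by a discount of 0.2.
    drop = expected_demand(4.2, 4.2, **params) - expected_demand(4.2, 4.0, **params)
    rise = expected_demand(3.8, 4.0, **params) - expected_demand(3.8, 3.8, **params)
    print(classify(params["beta2"], params["beta3"]), drop, rise)
```

Under these assumed coefficients a surcharge of a given size reduces expected demand by twice as much as an equally sized discount raises it.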
The reference price $r_t$ in equation (3.2.1) is given by some updating mechanism based on past prices, such that recent occasions have greater effects than more distant ones and a higher previous price results in a higher current reference price. In the literature we observe several ways in which a reference price can be formed. One, introduced by Krishnamurthi et al. (1992), is to operationalize the reference price as the one-period lagged price for a brand: $r_t = p_{t-1}$. Another way is to sum past prices (see e.g. Winer (1986)). Exponential smoothing (introduced in the adaptive expectations framework by Nerlove (1985)) is the most commonly used and empirically validated reference price mechanism in the literature (see e.g. Winer (1986), Greenleaf (1995), Kopalle et al. (1996), Fibich et al. (2003), Popescu and Wu (2007)):

Definition 3.2. Let $p_t$ denote the observed selling price and $r_t$ the reference price for a specific brand in period $t$. Then a reference price updating mechanism is given by

$r_{t+1} = \alpha r_t + (1 - \alpha) p_t$. (3.2.2)

Remark 3.2. Note that lower values of $\alpha$ represent a shorter-term memory. In particular, if $\alpha = 0$, the reference price is the one-period lagged price ($r_t = p_{t-1}$) as in Krishnamurthi et al. (1992).
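The exponential smoothing update can be traced over a few periods. The short sketch below uses the update $r_{t+1} = \alpha r_t + (1 - \alpha) p_t$ as it appears in the proof of theorem 3.1; the function names and the price sequence are hypothetical.

```python
def update_reference_price(r, p, alpha):
    """One step of exponential smoothing: r_next = alpha * r + (1 - alpha) * p."""
    return alpha * r + (1.0 - alpha) * p


def reference_path(r1, prices, alpha):
    """Reference prices r_1, r_2, ... generated by a given sequence of prices."""
    path = [r1]
    for p in prices:
        path.append(update_reference_price(path[-1], p, alpha))
    return path


if __name__ == "__main__":
    # A constant price of 4 pulls an initial reference price of 10 towards 4.
    print(reference_path(10.0, [4.0, 4.0, 4.0], alpha=0.5))
    # With alpha = 0 the reference price is simply the previous price.
    print(reference_path(10.0, [4.0, 6.0, 5.0], alpha=0.0))
```

Under a constant price the reference price approaches that price geometrically at rate $\alpha$, so that a smaller $\alpha$ corresponds to a shorter memory.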
The memory parameter $\alpha$ used in equation (3.2.2) is estimated in such a way that the highest possible $R^2$ (quantifying the goodness of fit) for equation (3.2.1) is obtained in an ordinary least squares regression. Statistical parameter estimation is beyond the scope of this thesis, but there is a considerable literature of empirical studies (e.g. Greenleaf (1995), Tellis (1988), Bijmolt et al. (2005), Ho and Zhang (2004), Popescu and Wu (2007), Natter et al. (2006), etc.) finding that a demand model like equation (3.2.1) fits empirical data very well and giving estimated parameter values for their models based on time-series data. Greenleaf (1995) and Kopalle et al. (1996), for example, furthermore find that estimates of $\alpha$ lie in the range $[0, 0.925]$. For a more detailed exposition of reference price mechanisms, see e.g. Kopalle and Lindsey-Mullikin (2003) and Moon et al. (2006).

Similar to chapter 2, pricing decisions $p_t$, where $\underline{p} \le p_t \le \bar{p}$, are made at the beginning of each period $t$ with the objective of maximizing the total expected discounted profit over the entire planning horizon $T$. In marketing models, demand fulfillment is taken for granted and thus no inventory decisions are considered. Costs $c$ in each time period $t$ are again assumed to be time-invariant. Therefore the maximum total expected discounted profit $V(r_1)$, when the initial reference price at the beginning of the planning horizon is given by $r_1$ and cash flows occurring in subsequent time periods are discounted by a one-period discount factor $\gamma$, is given as follows: [...]

Similar to section 2.3, the decision which price to charge in each period is made in stages and cannot be viewed in isolation.
Here again, the desire for high present profits, obtained by charging relatively low prices, must be balanced against the undesirability of low future profits, resulting from the formation of a low reference price as a consequence of the earlier price discounts. As the reference price summarizes past information which is relevant for future optimization, dynamic programming, which was introduced in section 2.3.1, is an appropriate technique for solving this problem. We thus rewrite equation (3.2.3) in terms of the Bellman equation: [...] with $V_{T+1} = 0$. Note that the state variables $r_t$ and the decision variables $p_t$ are related via equation (3.2.2). For the convenience of the reader, the notation used in this chapter is briefly summarized in table 3.1 at the beginning of this section.

Loss-neutral customer behavior
An important consequence of the reference price formation described above is that although frequent price discounts may be beneficial in the short run, they may be dangerous in the long run when consumers get used to these discounts and reference prices drop. The reduced price becomes anticipated and loses its effectiveness, whereas the non-promoted price becomes unanticipated and is perceived as a loss. As in section 2.3, we are interested in a possible steady state. Convergence analysis leads to the following theorem: [...]

Proof. The optimal total expected discounted profit for an infinite time horizon is given by (compare (3.2.3)): [...] where $r_1$ denotes the starting reference price in the current period. Substituting the transition function (3.2.2), as we did when proving theorems 2.6 and 2.9, is not sufficient here, since $r_{t+1} = \alpha r_t + (1 - \alpha) p_t$ still depends on the reference price $r_t$. Thus we substitute $p_t = p$ and express the reference price $r_{t+1}$ in terms of the starting reference price $r_1$ and $p$: [...] which for the linear demand and loss-neutral case becomes [...] Differentiating $V(r_1)$ with respect to $p$ and setting the result equal to zero yields [...] Using the assumption that the reference price is in a steady state $r$, we set $r_t = r$ for all $t$. ∎

For non-differentiable demand functions and non-negative ordering costs the proof of theorem 3.1 needs to be adjusted to the variational approach used in (Popescu and Wu 2007, proof of theorem 1). For consistency we give a short sketch of the proof in the following:

Theorem 3.2.
If the dynamic program (3.2.4) admits a steady state, then for loss-neutral customer behavior ($\beta_2 = \beta_3$) it is given by: [...]

Proof. The proof of (Popescu and Wu 2007, theorem 1) is adjusted in such a way that the one-period expected profit $\Pi(p, r)$ is given by $\Pi(p, r)$ [...] ∎

Popescu and Wu (2007) show (in theorem 2 and lemma 4) that if the system admits a steady state, the optimal price path converges monotonically (under certain assumptions) to a constant steady state, which they give only for the case of zero ordering costs and which the above theorem extends to the case $c \ge 0$. Furthermore, Kopalle et al. (1996) and Popescu and Wu (2007) show for time-invariant parameters that if the reference level is initially high, an optimizing firm should consistently price below this level, which has the effect of a skimming strategy. Similarly, a low initial reference level leads to the optimality of a penetration-type strategy. A numerical example for $\beta_0 = 100$, $\beta_1 = -20$, $\beta_2 = \beta_3 = -40$, $\alpha = 0.5$, $\gamma = 0.5$ and $c = 4$ is given in figure 3.3.

Remark 3.3. Note that the numerical example in figure 3.3 was obtained for a finite planning horizon $T = 40$. When considering a finite planning horizon, it is observed that prices are lowered towards the end of the horizon in order to benefit from reference effects. Keeping consumers' price expectations high is not reasonable when the end of a product's life cycle is reached and the remaining future selling periods do not suffice to outweigh the loss of profit a price promotion would induce in the promotion period. In this chapter we mainly focus on a steady state analysis (according to Popescu and Wu (2007)) and use a planning horizon $T = 40$ for numerical calculations, but only plot the results for the first 25 periods in order to eliminate the transient behavior at the end of the horizon and thus simulate an infinite-horizon behavior.
In the following we give some upper and lower bounds for the optimal steady state price $p^*_\infty$ and a sensitivity analysis in its parameters $\alpha$, $\gamma$ and $\beta_2$.

Theorem 3.3. The optimal steady state price $p^*_\infty$ is decreasing in the memory parameter $\alpha$, increasing in the discount factor $\gamma$ and decreasing in the reference effect $|\beta_2|$. Furthermore, it satisfies

$p' \le p^*_\infty \le p^0_\infty$, (3.3.12)

where $p'$ denotes the myopic, one-period profit-optimizing price ($\gamma = 0$) and $p^0_\infty$ denotes the optimal price in the absence of reference price effects ($\beta_2 = 0$).
Proof. Since by definition 3.1 prices $p$ are assumed to be greater than costs $c$ and expected demand is assumed to be non-negative, it is clear that $\beta_0 + \beta_1 c \ge 0$. We furthermore know that the expected demand is increasing in the reference price ($\beta_2 \le 0$) and that the discount factor $\gamma$ lies in $[0, 1)$. By differentiating $p^{**}$ with respect to $\alpha$, $\gamma$ and $\beta_2$, we thus obtain
$$\frac{\partial p^{**}}{\partial \alpha} = \frac{\gamma (1-\gamma)\,\beta_2\,(\beta_0 + \beta_1 c)}{\bigl(2\beta_1(1-\alpha\gamma) + \beta_2(1-\gamma)\bigr)^2} \le 0,$$
$$\frac{\partial p^{**}}{\partial \gamma} = \frac{-(1-\alpha)\,\beta_2\,(\beta_0 + \beta_1 c)}{\bigl(2\beta_1(1-\alpha\gamma) + \beta_2(1-\gamma)\bigr)^2} \ge 0,$$
$$\frac{\partial p^{**}}{\partial \beta_2} = \frac{(1-\gamma)(1-\alpha\gamma)\,(\beta_0 + \beta_1 c)}{\bigl(2\beta_1(1-\alpha\gamma) + \beta_2(1-\gamma)\bigr)^2} \ge 0,$$
which shows that the optimal steady state price $p^{**}$ is decreasing in $\alpha$, increasing in $\gamma$ and decreasing in $|\beta_2|$. Since $p^{**}$ increases in $\gamma$, the first part of equation (3.3.12) follows, because $p'$ corresponds to $\gamma = 0$. Since $p^{**}$ decreases in $|\beta_2|$, the second part follows, because $P_\infty$ corresponds to $\beta_2 = 0$.
Remark 3.5. The inequality $p' \le p^{**}$ from the above theorem states that a myopic firm is oblivious of the eroding effect of its prices on future demand, and hence on future profits, and therefore charges prices that are less than or equal to the optimal price. By charging higher prices, current profits are traded off for long-term profitability from higher reference prices. Furthermore, $p^{**} \le P_\infty$ shows that the optimal price in the absence of a reference price is always greater than or equal to the optimal price obtained from the model including reference effects. This means that in the long run, strategic firms should charge lower prices when consumers form reference effects than when they do not.
... with changes in the input parameters $\beta_2$, $\gamma$ and $\alpha$. Furthermore, note that the optimal steady state price in the absence of a reference price is given by $P_\infty = 4.5$.
We now extend the above results to the loss-averse case ($\beta_2 < \beta_3$) and give the reader an idea about what happens in the loss-seeking case ($\beta_2 > \beta_3$).
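The sensitivity statements and the two bounds can be checked numerically with a minimal sketch. The closed form used here is an assumption of this example: it is what the stationarity conditions yield for expected demand $\beta_0 + \beta_1 p + \beta_2 (p - r)$, profit $(p - c)\,E[D]$ and the update $r_{t+1} = \alpha r_t + (1-\alpha) p_t$. The function name `steady_state_price` is hypothetical.

```python
def steady_state_price(beta0, beta1, beta2, alpha, gamma, c):
    """Candidate steady-state price for the loss-neutral linear model.

    Assumed closed form (a sketch, derived from the stationarity conditions
    of the pricing dynamic program under the assumptions in the lead-in).
    """
    num = c * beta2 * (1 - gamma) - (1 - alpha * gamma) * (beta0 - beta1 * c)
    den = 2 * beta1 * (1 - alpha * gamma) + beta2 * (1 - gamma)
    return num / den


# Parameters of the numerical example in the text.
base = dict(beta0=100.0, beta1=-20.0, beta2=-40.0, alpha=0.5, gamma=0.5, c=4.0)

p_ss = steady_state_price(**base)                         # strategic firm
p_myopic = steady_state_price(**{**base, "gamma": 0.0})   # gamma = 0
p_no_ref = steady_state_price(**{**base, "beta2": 0.0})   # beta2 = 0

# Sweep one parameter at a time, keeping the others at their base values.
by_alpha = [steady_state_price(**{**base, "alpha": a}) for a in (0.0, 0.25, 0.5, 0.75)]
by_gamma = [steady_state_price(**{**base, "gamma": g}) for g in (0.0, 0.3, 0.6, 0.9)]
by_beta2 = [steady_state_price(**{**base, "beta2": b2}) for b2 in (0.0, -20.0, -40.0, -80.0)]

print(p_myopic, p_ss, p_no_ref)
```

For these parameters the sketch orders the three prices as myopic, strategic, and no-reference-effect price, in line with the bounds stated in the theorem.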

Loss-averse and loss-seeking customer behavior
This section investigates the transient and long-term behavior of the optimal dynamic pricing policy under loss-averse customer behavior.
In the steady state characterization (3.3.16) of theorem 3.4, $\underline{p}^{**}$ denotes a penetration type and $\overline{p}^{**}$ denotes a skimming type steady state solution.
For a proof of theorem 3.4 we refer the reader to (Popescu and Wu 2007, theorem 4 ).
Remark 3.6. From equation (3.3.16) it becomes clear that in the loss-averse case there exists more than one steady state, depending on the initial reference price $r_1$. If initial price expectations are lower than $\underline{p}^{**}$, the optimal pricing strategy starts with a low price and monotonically increases prices until the steady state $\underline{p}^{**}$ is reached. If initial price expectations are higher than $\overline{p}^{**}$, the optimal pricing strategy starts with a high price and monotonically decreases prices until the steady state $\overline{p}^{**}$ is reached. For initial price expectations lying between the two possible steady states $\underline{p}^{**}$ and $\overline{p}^{**}$, a constant pricing policy at the customers' initial price expectation $r_1$ is optimal.
An illustration of more than one steady state, depending on the initial reference price $r_1$, is provided in figure 3.7. The figure shows that for initial reference prices $\underline{p}^{**} \le r_1 \le \overline{p}^{**}$ the decision variable $p^*(r_1)$ equals the state variable $r_1$ and thus the steady state is already reached. For $r_1 \ge \overline{p}^{**}$ or $r_1 \le \underline{p}^{**}$, the optimal prices $p_t^*$ and reference prices $r_t$ converge monotonically over time until they reach the steady state $\overline{p}^{**}$ or $\underline{p}^{**}$, respectively.
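The dependence of the limiting price on the initial reference price described in the remark amounts to projecting $r_1$ onto the interval between the two steady states. A minimal sketch follows; the function name and the two numerical bounds in the usage lines are purely hypothetical illustrations, not values from the text.

```python
def limiting_price(r1, p_low, p_high):
    """Steady state reached from the initial reference price r1 (loss-averse case).

    p_low  : penetration type steady state (reached from below)
    p_high : skimming type steady state (reached from above)
    """
    if p_low > p_high:
        raise ValueError("p_low must not exceed p_high")
    # Below the interval prices rise to p_low, above it they fall to p_high,
    # and inside the interval the initial expectation is already a steady state.
    return min(max(r1, p_low), p_high)


# Hypothetical steady states, chosen only to illustrate the three regimes.
P_LOW, P_HIGH = 4.2, 4.4
examples = {r1: limiting_price(r1, P_LOW, P_HIGH) for r1 in (3.0, 4.3, 5.0)}
print(examples)
```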
If buyers are loss seeking ($\beta_2 > \beta_3$) and thus respond more to discounts than to surcharges

Introduction
In the previous chapters we gave the reader a review of the state-of-the-art methodological literature and found that it divides into two rather distinct streams: The operations oriented stream described in chapter 2 and the marketing oriented stream described in chapter 3.
Operations management oriented work, introduced in chapter 2, mainly deals with determining optimal production decisions and thus usually describes a firm's possible cost structure very well: costs are assumed to be non-stationary, meaning that they can vary over time, and fixed costs can be included in the model. The limitation of this model is that it relies on rather simplistic demand assumptions. Demand is modeled as a function of the current price only, not taking into account past prices, which also influence customers' buying decisions.
The strength of the marketing models, introduced in chapter 3, is their rich demand model, which accounts for intertemporal demand correlations by incorporating both the current price and the firm's pricing history. Yet they have serious limitations, as they use a very simplistic cost structure which does not account for supply chain management interactions (e.g. stationary variable costs). What is even more restrictive is that marketing takes demand fulfillment for granted. Thus, although the demand function is defined stochastically (see definition 3.1), the problem, by maximizing expected profits, reduces to a deterministic setting. In conclusion, both prevalent research streams consider only a partial picture of the relevant system. This work is devoted to combining the two literature streams described above: we want to use the rich cost models commonly used in operations research and combine them with demand models which account for intertemporal demand correlation and have so far mainly been applied in marketing.

Model formulation
When using mathematics to solve real world problems our main aim is to obtain a mathematical model that describes or represents the real situation as well as possible. The formulation of a mathematical model is a challenging task: first the scientist needs to understand the problem and then has to identify generalizable principles and processes such that a complex system can be reduced to a tractable level that makes the essential structure of the system clear. Variables have to be defined and relationships between these variables have to be established. The key to a good model lies in which simplifications are introduced and how; it is very important to understand what aspects of the system the model is intended to describe, and what the model's limitations are as a result of the simplification. Hence, we now turn to the formulation of an integrated inventory control and pricing model, combining the strengths and benefiting from the dynamics of each of the two models introduced in chapters 2 and 3. This new model should be considerably better able to represent the real world situation.
We consider a monopolistic retailer or manufacturer who maintains an inventory of a particular product and, prior to facing random demand in each period $t$ of a finite horizon $T$, jointly determines a selling price and a stocking quantity. The integrated model with reference price effects is a combination of the two models introduced in chapter 2 and chapter 3. Thus, we only briefly describe the variables used and refer the reader to the two previous chapters for further detail. Demand perturbations $\epsilon_t$ in different periods are assumed to be statistically independent and identically distributed according to general stochastic demand functions. We furthermore assume that consumers have a memory and that demand not only depends on the current selling price $p_t$, but also on a reference level $r_t$ resulting from the pricing history (see chapter 3). The inventory level $x_t$ is reviewed at regular intervals (periodic review model), and after each review, at the beginning of a new period $t$, an appropriate quantity $y_t - x_t$ is ordered and a per unit selling price $p_t$ is charged. As in the previous chapters, we assume that all input variables are stationary and thus do not change over time. The ordering costs include a per unit variable cost $c \ge 0$ and a fixed setup cost $k \ge 0$ which is incurred only if an order is placed ($y_t > x_t$). Again, as in chapter 2, orders placed are essentially received immediately (received in time to meet demand that arises in that period).
Costs are expressed in beginning-of-period cash units; cash flows occurring in subsequent time periods are discounted by the one-period discount factor $\gamma \in (0, 1)$. Each unit of positive leftover stock at the end of each period incurs holding costs $h \ge 0$. If demand exceeds the inventory on hand, per unit backlogging (penalty) costs $b$ are charged and demand is filled when the additional inventory becomes available. To ensure that it is not optimal to never order anything and merely accumulate backlog penalty costs, we assume that $b > (1 - \gamma)c$.
After the last period, a final order is placed to fulfill backlogged demand. Furthermore, for simplicity we assume that leftover units at the end of the horizon can be salvaged at the original ordering costs ($v = c$ in chapter 2). For the convenience of the reader the notation is summarized in table 4.1.
In the following, we model demand additively as in chapter 3 such that it is decreasing in price $p_t$. Furthermore, consumers perceive a gain and thus demand increases if the current price is lower than the reference price ($p_t < r_t$), and consumers perceive a loss and demand decreases if the current price is higher than the reference price ($p_t > r_t$).
The reference price $r_t$ is formed by exponential smoothing as in chapter 3. For the convenience of the reader we again give the definition here:
Definition 4.2. Let $p_t$ denote the observed selling price and $r_t$ the reference price for a specific brand in period $t$; then for the memory parameter $0 \le \alpha < 1$ a reference price updating mechanism is given by
$$r_{t+1} = \alpha r_t + (1 - \alpha) p_t.$$
As in the previous chapters the objective is to maximize expected profit over the entire planning horizon $T$. The maximum total expected discounted profit $v(x_1, r_1)$, when the initial inventory level at the beginning of the planning horizon is given by $x_1$ and the initial reference price is given by $r_1$, is given as follows: (4.2.3)
Similar to section 2.3, the decision which price to charge in each period and how much to order is made in stages. Again the desire for high present profits, obtained by charging relatively low prices, must be balanced against the undesirability of low future profits. As described in section 2.3.1 this tradeoff is very well captured by the technique of dynamic programming.
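The gain and loss mechanism and the exponential smoothing of the reference price can be sketched in a few lines. The functional form below is an assumption of this example: a linear expected demand in which $\beta_2$ weights perceived losses ($p > r$) and $\beta_3$ weights perceived gains ($p < r$); the function names and default parameter values are hypothetical.

```python
def expected_demand(p, r, beta0=100.0, beta1=-20.0, beta2=-40.0, beta3=-40.0):
    """Expected demand with a kinked reference price effect (assumed form)."""
    loss = max(p - r, 0.0)   # price above reference: perceived loss, lowers demand
    gain = min(p - r, 0.0)   # price below reference: perceived gain (negative number)
    return beta0 + beta1 * p + beta2 * loss + beta3 * gain


def update_reference(r, p, alpha=0.5):
    """Exponential smoothing: r_{t+1} = alpha * r_t + (1 - alpha) * p_t."""
    return alpha * r + (1.0 - alpha) * p


# Reference price path when a constant price of 4 is charged, starting at r_1 = 5.
path = [5.0]
for _ in range(3):
    path.append(update_reference(path[-1], 4.0))

print(path)                       # the reference price drifts towards the price
print(expected_demand(4.0, 5.0))  # price below reference: demand is boosted
```

With a constant price the reference price approaches that price geometrically at rate $\alpha$, so a larger memory parameter means slower adaptation of consumers' expectations.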

Dynamic program
We now reformulate problem (4.2.2) in terms of dynamic programming using backward recursion as in section 2.
Note that equation (4.3.4) gives the gross quantity of stock at the beginning of period $t + 1$, which equals the inventory on hand after ordering at the beginning of period $t$ less the total quantity actually sold during that period (we refer the reader to figure 2.2 for a graphical illustration). Equation (4.3.5) gives the consumers' reference price in period $t + 1$, which is formed from past prices by exponential smoothing with a memory parameter $\alpha$ (see figure 3.2). A brief idea of the system dynamics is given in figure 4.1.
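The two state transitions can be illustrated with a one-period simulation step. This is a sketch under assumptions: loss-neutral linear demand $\beta_0 + \beta_1 p + \beta_2 (p - r) + \epsilon$, revenue collected on all demand including backlogged units, and zero fixed ordering costs. The function name `step` and the default parameter values are hypothetical.

```python
def step(x, r, y, p, eps, beta0=100.0, beta1=-20.0, beta2=-40.0,
         alpha=0.5, c=0.5, h=0.005, b=0.4):
    """Advance the state (x, r) by one period under decisions (y, p).

    x   : inventory before ordering      y : inventory after ordering (y >= x)
    r   : reference price                p : selling price
    eps : realized demand perturbation
    Returns (x_next, r_next, period_profit).
    """
    if y < x:
        raise ValueError("order-up-to level y must be at least x")
    demand = beta0 + beta1 * p + beta2 * (p - r) + eps
    holding = h * max(y - demand, 0.0)      # cost of leftover stock
    backlog = b * max(demand - y, 0.0)      # penalty for unmet demand
    profit = p * demand - c * (y - x) - holding - backlog
    x_next = y - demand                      # negative values are backlogged demand
    r_next = alpha * r + (1.0 - alpha) * p   # exponential smoothing of past prices
    return x_next, r_next, profit


print(step(x=0.0, r=5.0, y=70.0, p=4.0, eps=0.0))    # leftover stock of 10 units
print(step(x=0.0, r=5.0, y=70.0, p=4.0, eps=20.0))   # backlog of 10 units
```

Applying `step` repeatedly with a fixed policy gives a sample path of inventory levels and reference prices.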
In the following we focus on the special case of zero fixed ordering costs ($k = 0$, as in sections 2.3.2 and 2.3.3). We then transform the above dynamic program in such a way that

Analytical Analysis of the Integrated Model
The system of the integrated model consists of two state and two decision variables (see table 4.2) and is thus much more complex than the one discussed in chapter 2. This complicates an analytical analysis significantly, and we can only prove structural results under restrictive assumptions. We start off by analyzing the single-period problem (newsvendor model) and then extend some of the obtained results to the two-period case. However, even in this simplest version of the model, we only consider loss-neutral customer behavior ($\beta_2 = \beta_3$) to ensure analytical tractability. Thus the demand model (4.2.1) reduces to (5.0.1)
Furthermore, for section 5.2 we assume that the random variable $\epsilon$, which follows an arbitrary probability function $f(\cdot)$, is continuous and differentiable in order to avoid additional complexity in the analysis of the two-period model. At the end of this chapter, for the multi-period case we give an extension of the proof of Federgruen and Heching (1999) and show the optimality of a base-stock policy in the integrated model. Admittedly, these results are of limited practical relevance, since several assumptions on demand and revenues have to be made which many commonly used demand functions, including the linear one defined in equation (5.0.1), do not fulfill. However, we will provide an extensive numerical study in chapter 6, where we show that in the cases analyzed the obtained results still hold under much less restrictive assumptions.

One-period model
We will start our analysis of equation (4.3.1) with the last period. The optimality of a base-stock-list-price policy follows directly from Federgruen and Heching (1999), since the reference price $r$ is only an additional parameter for the one-period case. In this section, however, we will provide an alternative proof and give implicit solutions for the optimal price and inventory level with respect to the reference price, which will be used to extend the base-stock-list-price result to a two-period setting in section 5.2. To simplify the notation we in the

The above two lemmas lead to an optimal pricing and ordering policy (compare section 2.2).
Theorem 5.1 (Base-stock-list-price policy). For the linear demand function defined in equation (5.0.1), the system (4.3.1) to (4.3.3) and $b > (1 - \gamma)c$, the optimal policy for the one-period case is a base-stock-list-price policy, where $y^*(x, r)$ and $p^*(x, r)$ are given by
$$y^*(x, r) = \begin{cases} S^*(P^*(r), r), & x < S^*(P^*(r), r) \\ x, & \text{else} \end{cases} \qquad (5.1.14)$$
$$p^*(x, r) = \begin{cases} P^*(r), & x < S^*(P^*(r), r) \\ p^*(x, r), & \text{else.} \end{cases}$$
Furthermore, the discounted price $p^*(x, r)$ is given implicitly by
$$E[D(p^*(x, r), r, \epsilon)] + (p^*(x, r) - \gamma c - b)\,E[D_p(p^*(x, r), r, \epsilon)] + (h + b)\,E[D_p(p^*(x, r), r, \epsilon)]\,F\bigl(x - E[D(p^*(x, r), r, \epsilon)]\bigr) = 0. \qquad (5.1.18)$$
Proof. Since $y - E[D(p, r, \epsilon)]$ is trivially jointly concave in $y$ and $p$, it follows directly from lemma 5.1 that the expected total profit $J(x, y, p, r)$ is jointly concave in $y$ and $p$. Thus the optimization problem over the two variables $y$ and $p$ can be reduced to an optimization problem over the single variable $y$ as a function of $p$, with subsequent substitution of the result back into $J(x, y, p, r)$, which is thereafter solved for $p$. In the following we show the optimality of a base-stock policy, provide an optimal order-up-to level $S^*(p, r)$ as a function of price $p$, and then continue with the price optimization. By setting the derivative $\partial J(x, y, p, r) / \partial y$ equal to zero we obtain the solution to $\max_y J(x, y, p, r)$, which is denoted by
$$y(x, r, p) = F^{-1}\left(\frac{b - (1 - \gamma)c}{h + b}\right) + E[D(p, r, \epsilon)].$$
For proving equations (5.1.17) and (5.1.18) we now need to distinguish between the two cases $x < S^*(p, r)$ and $x \ge S^*(p, r)$.
Let $x < S^*(p, r)$. For notational convenience we denote $y_0 := F^{-1}\left(\frac{b - (1 - \gamma)c}{h + b}\right)$, which yields $S^*(p, r) = y_0 + E[D(p, r, \epsilon)]$. In order to find the optimal list-price $p$ we substitute
Let $x \ge S^*(p, r)$. In this case no order is placed ($y = x$), and differentiating with respect to $p$ gives
$$\frac{\partial J(x, x, p, r)}{\partial p} = E[D(p, r, \epsilon)] + (p - \gamma c - b)\,E[D_p(p, r, \epsilon)] + (h + b)\,E[D_p(p, r, \epsilon)]\,F\bigl(x - E[D(p, r, \epsilon)]\bigr). \qquad (5.1.26)$$
Since $J(x, y, p, r)$ is jointly concave in $p$ and $y$, equation (5.1.18) results from setting equation (5.1.26) equal to zero.
In order to prove the optimality of a list-price policy, we need to show that $p^*(x, r)$ is unique and non-increasing in $x$. By equation (5.1.19) and demand $D(p, r, \epsilon)$ being concave in $p$, it follows by lemma 5.1 that $J(x, y, p, r)$ is strictly concave in $p$ and thus the optimal price $p^*(x, r)$ is unique. Furthermore, $J(x, y, p, r)$ is submodular in $y$ and $p$. It follows from theorem 2.8.1 in Topkis (1998) that the optimal price $p^*(y, r)$ is non-increasing in $y$ and hence in $x$. Substituting the optimal list price $p = P^*(r)$ in equation (
Figures 5.1 and 5.2 give a graphical description of the policy which we showed to be optimal in the above theorem. They are an extension of figure 2.1 and figure 2.7 in chapter 2, with the difference that now the optimal decisions depend on two states: the inventory before ordering $x$ and the reference price $r$. Note that the dependency of the optimal price $p^*$ and the optimal inventory level after ordering $y^*$ on the reference price $r$ adds an additional dimension to the solution space. It becomes clear that, in contrast to chapter 2, the base-stock level $S^*(r)$ and the optimal price $p^*(x, r)$ depend on the consumers' price expectation $r$.
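The one-period policy can be illustrated numerically with a minimal sketch. The assumptions of this example are: loss-neutral linear expected demand $\beta_0 + \beta_1 p + \beta_2 (p - r)$, a zero-mean normal perturbation with standard deviation `sigma`, the base-stock level $F^{-1}\bigl((b - (1-\gamma)c)/(h+b)\bigr) + E[D]$, and a list price that maximizes $(p - c)\,E[D(p, r, \epsilon)]$, which is what substituting the base-stock level into the one-period profit yields under these assumptions. Function names and default values are hypothetical.

```python
from statistics import NormalDist


def list_price(r, beta0=100.0, beta1=-20.0, beta2=-40.0, c=0.5):
    """Maximizer of (p - c) * (beta0 + beta1*p + beta2*(p - r)) over p (sketch)."""
    return ((beta1 + beta2) * c - beta0 + beta2 * r) / (2.0 * (beta1 + beta2))


def base_stock_level(p, r, sigma=20.0, beta0=100.0, beta1=-20.0, beta2=-40.0,
                     gamma=0.8, c=0.5, h=0.005, b=0.4):
    """Order-up-to level: safety stock at the critical ratio plus expected demand."""
    ratio = (b - (1.0 - gamma) * c) / (h + b)
    if not 0.0 < ratio < 1.0:
        raise ValueError("critical ratio must lie in (0, 1); need b > (1 - gamma) * c")
    safety_stock = NormalDist(0.0, sigma).inv_cdf(ratio)
    return safety_stock + beta0 + beta1 * p + beta2 * (p - r)


def one_period_order(x, r):
    """Base-stock rule: order up to S*(P*(r), r) if x is below it, else do nothing."""
    s = base_stock_level(list_price(r), r)
    return max(s, x)


for r in (1.5, 2.0, 2.5):
    p = list_price(r)
    print(r, round(p, 4), round(base_stock_level(p, r), 2))
```

In this sketch both the list price and the base-stock level rise with the reference price, which is the monotonicity discussed for the loss-neutral one-period case.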
Looking at figures 5.1 and 5.2, the question arises whether new structural properties of the optimal policies in the reference price r can be formulated. This leads to the following theorem, where for the one period case and loss-neutral customer behavior we show that both the optimal pricing and ordering policy are non-decreasing in the reference price r.
To prove theorem 5.2 we introduce the theory of implicit differentiation (Heuser 1981, theorem 170.1) in the lemma below.
While in the loss-neutral case the list-price $P^*(r)$ is a smooth (continuous) function in $r$, since expected demand is a smooth function in $p$ and $r$ due to $\beta_2 = \beta_3$, in the loss-averse case expected demand is a kinked function in $p$ and $r$ (with two different slopes depending on $p \le r$ or $p > r$, respectively) and therefore the list-price $P^*(r)$ is a kinked function in $r$. Note that this relation results in a list-price $P^*(r)$ that is threefold: $P^*(r) < r$, $P^*(r) = r$ and $P^*(r) > r$. It is clear that for all reference prices $r$ with $P^*(r) = r$ the corresponding base-stock level $S^*(r, r)$ from equation ( and thus an optimal inventory policy is no longer monotone in the reference price $r$ for loss-averse customer behavior (see figure 5.4).
From the above remark it becomes clear that loss-averse customer behavior already adds considerable complexity to the model in the one-period case and thus significant additional dynamics in the multi-period setting. In this thesis we mainly concentrate on the loss neutral case.

Two-period model
We are now in the position to add one period to the planning horizon and study the two period case. This is not only a theoretical exercise but also has significant practical relevance on its own: due to shortening life-cycles an increasing fraction of a retailer's assortment consists of products where there is only one reorder possibility. Due to long production lead times there is as a consequence only one possibility for reordering after the initial order is placed, which motivates the application of a two-time-period model.
The dynamic program defined in (4.3.1) to (4.3.3) will be used in this section for $t = 1, 2$.

(5.2.8)
We now consider the case $S^*(r_2) < 0$, which yields (5.2.9)
We will now show that the four possible summands are jointly concave in $y_1$ and $p_1$.
Proof. The first two functions are trivially jointly concave in $y_1$ and $p_1$, since the two possible realizations of the profit function in the first time period, $\Pi_1^1(x_1, y_1, p_1, r_1, \epsilon_1)$ and $\Pi_1^2(x_1, y_1, p_1, r_1, \epsilon_1)$, are jointly concave by lemma A.1 in appendix A and $V_2^*(r_2(p_1, r_1))$ is jointly concave by lemma A.2 with a discount factor $\gamma > 0$.
However, in the following we will show that the joint concavity of the first time period's profits $\Pi_1(x_1, y_1, p_1, r_1, \epsilon_1)$ is strong enough to dominate the non-concavity of the future profit. It is easy to see that the 'misbehavior' of non-concavity will be worse, the larger the discount factor $\gamma$ is. Hence, without loss of generality, we examine the case of no discount ($\gamma = 1$) in the following analysis.
It now remains to be shown that the profit functions of the first and second time period, respectively, are continuous and concave in their points of non-differentiability.

We are now ready to show the optimality of a base-stock-list-price policy, for which we introduce another useful lemma, which can be found in Heuser (1981).
2. The profit function $J_t(y_t, p_t, x_t, r_t)$ is jointly concave in $y_t$ and $p_t$.
3. The profit function $J_t(y_t, p_t, x_t, r_t)$ is submodular in $y_t$ and $p_t$.
Proof. The above statement is true for the second time period $t = 2$ by theorem 5.1. It remains to be shown that $J_1(x_1, y_1, p_1, r_1)$ is jointly concave and submodular in $y_1$ and $p_1$.

Multi-period model
Under some restrictive assumptions we are able to extend the base-stock property obtained above to the multi-period case. To simplify the proofs we now consider the transformed model given by equations (4.3.7) to (4.3.9) in chapter 4. Furthermore, we introduce the following assumptions:
Assumption 5.1. In each time period $t = 1, \ldots, T$ the following holds: The demand function $D_t(p, r, \epsilon)$ is non-increasing in $p$, non-decreasing in $r$ and jointly concave in $p$ and $r$, while the revenues $p D_t(p, r, \epsilon)$ are assumed to be jointly concave in $p$ and $r$. Furthermore, $G_t(y, p, r)$ is assumed to be jointly convex in $y$, $p$ and $r$.
Lemma 5.9. The expected profit-to-go function $V_t(r, x)$ is non-increasing in $x$ for all $r$ and $t = 1, \ldots, T$.
Theorem 5.4 (Base-stock policy). For the system (4.3.7) to (4.3.9) and assumption 5.1, the following holds for any time period $t = 1, \ldots, T$:
Proof. The proof technique is similar to the one suggested in theorem 2.5 and is conducted by induction. $V_{T+1}(x, r) = 0$ is trivially jointly concave in $x$ and $r$ and thus $J_T(y, p, r)$ is jointly concave in $y$, $p$ and $r$ by assumption 5.1. We now assume that $V_{t+1}(x, r)$ is jointly concave in $x$ and $r$ and $J_{t+1}(y, p, r)$ is jointly concave in $y$, $p$ and $r$.
We now show that $V_t(x, r)$ is jointly concave in $x$ and $r$. From
Since $J_t(y, p, r)$ is jointly concave in $y$ and $p$, a base-stock policy with a base-stock level $S^*(r)$ and an optimal price $p^*(x, r)$ is optimal, and thus the optimal profit $V_t(x, r)$ can be written as
$$V_t(x, r) = J_t(S^*(r), p^*(x, r), r). \qquad (5.3.6)$$
It furthermore follows that (5.3.7)
The first inequality of (5.3.7) holds since $p^*\left(\tfrac{x_1 + x_2}{2}, \tfrac{r_1 + r_2}{2}\right)$ and $\max\left(S^*\left(\tfrac{r_1 + r_2}{2}\right), \tfrac{x_1 + x_2}{2}\right)$ are the global optima of $V_t\left(\tfrac{x_1 + x_2}{2}, \tfrac{r_1 + r_2}{2}\right)$. Any other solution, particularly $\tfrac{p^*(x_1, r_1) + p^*(x_2, r_2)}{2}$ and $\tfrac{\max(S^*(r_1), x_1) + \max(S^*(r_2), x_2)}{2}$, will thus be less than or equal to the optimal solution. $J_t(y, p, r)$ being jointly concave in $y$, $p$ and $r$ explains the second inequality of (5.3.7).
Proof. The proof is a combination of the proofs of theorem 2.9 in section 2.3.3 and theorem 3.1 in section 3.3. As in the proof of the joint pricing and inventory case without reference effects, we first find the steady state inventory. Again the optimal total discounted profit for an
Differentiating with respect to $y$ and using the same arguments as in the proof of theorem 2.9, we get the steady state base-stock equation (5.3.9) depending on the price. Note that the reference price has to be equal to the price, as this is necessary for the steady state price. The steady state price in formula (5.3.8) is found analogously to the proof of the steady state for the reference price model (see theorem 3.1) after substituting $y^{**}(p)$ for $y$ in the infinite sum (5.3.10) and differentiating with respect to the price $p$. This yields exactly the same steady state price $p^{**}$ as in theorem 3.1.
Remark 5.6. From remark 5.5 it is clear that a sensitivity analysis of the steady state price $p^{**}$ of equation (5.3.8) is identical to the one described in theorem 3.3 in section 3.3, and thus the optimal steady state price $p^{**}$ is decreasing in the memory parameter $\alpha$, increasing in the discount factor $\gamma$ and decreasing in the reference effect $|\beta_2|$.
Since $F^{-1}\left(\frac{b - (1 - \gamma)c}{h + b}\right)$ is independent of both $\alpha$ and $|\beta_2|$, $p^{**}$ is decreasing in $\alpha$ and $|\beta_2|$, and expected demand is decreasing in price, it is clear that $E[D(p^{**}, p^{**}, \epsilon)]$ is increasing in $\alpha$ and $|\beta_2|$. As a consequence, $y^{**}$ is increasing in $\alpha$ and $|\beta_2|$. However, for the discount factor $\gamma$ we cannot make a definite statement about the behavior of the steady state base-stock $y^{**}$, as the safety stock is increasing in $\gamma$ while $E[D(p^{**}, p^{**}, \epsilon)]$ is decreasing in $\gamma$.
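The two opposing effects of the discount factor can be made concrete with a minimal numerical sketch. The assumptions of this example are: normally distributed demand perturbations with standard deviation `sigma`, expected steady state demand $\beta_0 + \beta_1 p^{**}$ (reference price equal to price), and the closed form for $p^{**}$ that the stationarity conditions yield for the linear loss-neutral model. Function names and parameter defaults are hypothetical.

```python
from statistics import NormalDist


def steady_state_price(gamma, beta0=100.0, beta1=-20.0, beta2=-40.0,
                       alpha=0.5, c=0.5):
    """Assumed closed form of the loss-neutral steady-state price (sketch)."""
    num = c * beta2 * (1 - gamma) - (1 - alpha * gamma) * (beta0 - beta1 * c)
    den = 2 * beta1 * (1 - alpha * gamma) + beta2 * (1 - gamma)
    return num / den


def steady_state_base_stock(gamma, sigma, beta0=100.0, beta1=-20.0,
                            c=0.5, h=0.005, b=0.4):
    """Safety stock at the critical ratio plus expected steady-state demand."""
    ratio = (b - (1.0 - gamma) * c) / (h + b)      # requires b > (1 - gamma) * c
    safety_stock = NormalDist(0.0, sigma).inv_cdf(ratio)
    expected_demand = beta0 + beta1 * steady_state_price(gamma)  # r equals p
    return safety_stock + expected_demand


for sigma in (1.0, 20.0):
    levels = [round(steady_state_base_stock(g, sigma), 2) for g in (0.5, 0.8, 0.9)]
    print(f"sigma = {sigma:5.1f}: base-stock for gamma = 0.5, 0.8, 0.9 -> {levels}")
```

In this sketch the base-stock falls with the discount factor when demand uncertainty is small, because the drop in expected demand dominates, and rises when uncertainty is large, because the growing safety stock dominates.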
As a result, in case of small uncertainty in demand the base-stock is decreasing in $\gamma$, whereas for large uncertainty the base-stock is increasing in $\gamma$. A numerical example is given in

Simulations and Numerical Investigations
The functional equations described in (4.3.1) to (4.3.3) and (4.3.7) to (4.3.9) in chapter 4 lack a closed-form solution, except in a very limited number of special cases, like the one-period case presented in section 5.1. Hence, the main focus of chapter 5 was on showing that an optimal solution theoretically exists and is unique, which then resulted in a simple optimal pricing and inventory control policy, like a base-stock-list-price policy. In this chapter we complement these analytical results with a numerical study. We shall relax some of the rather strong assumptions of chapter 5 in order to assess the robustness of its implications and give explicit optimal solutions. Furthermore, we will investigate how the optimal inventory and pricing paths evolve over time and give a qualitative understanding of the obtained results.
Moreover we aim at examining whether the model's solutions have some additional structural properties in the reference price.
We shall study a discrete time, discrete state Markov decision model, since the model under consideration is a periodic review model, and demand realization can only be integer values (we cannot sell a fraction of an item). We already described in chapter 4 that we are facing a two-dimensional state space (comprising an 'internal' state describing the production system and an 'external' state related to the market) and a two-dimensional action space (reflecting both marketing and production/logistic decisions). Thus we are likely to run into the 'curse of dimensionality' (the tendency for the solution time to grow exponentially with the dimensionality of the state or action space) where some theoretical insights to efficient computer programming are required and the art of programming lies in finding a trade-off between memory-intensive and run time-intensive computational methods.
All results of this chapter are obtained by a dynamic program using backward recursion developed in the numerical computing environment MATLAB. The Compecon MATLAB toolbox, provided by Miranda and Fackler (2002) and especially designed to solve stochastic dynamic economic models, could not be used for the integrated model including reference price effects due to the memory intensive data structure of their algorithms for higher dimensional state spaces. Therefore we developed a new algorithm particularly suitable for our setting described in chapter 4. In order to guarantee numerical stability and control roundoff errors, we implemented a spline interpolation on the value function with respect to the state and decision space, which also reduced computation times significantly.
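The principle of backward recursion on a discretized state space with an interpolated value function can be sketched in a deliberately reduced setting. The example below is not the algorithm described above: it is written in Python rather than MATLAB, uses linear instead of spline interpolation, and solves only the deterministic pricing problem with a reference price (one state, one decision) instead of the integrated model. The demand form, the grids and all names are assumptions of this sketch.

```python
def solve_pricing_dp(beta0=100.0, beta1=-20.0, beta2=-40.0, alpha=0.5,
                     gamma=0.5, c=4.0, T=40, lo=3.0, hi=6.0, n_r=61, n_p=301):
    """Backward recursion for V_t(r) = max_p (p - c) D(p, r) + gamma V_{t+1}(r').

    Returns the reference price grid and policy[t][i], the optimal price in
    period t + 1 at grid point i (policy[0] is the first period).
    """
    r_grid = [lo + (hi - lo) * i / (n_r - 1) for i in range(n_r)]
    p_grid = [lo + (hi - lo) * j / (n_p - 1) for j in range(n_p)]
    step = (hi - lo) / (n_r - 1)

    def interp(values, r):
        # Linear interpolation of the value function on the uniform grid.
        pos = min(max((r - lo) / step, 0.0), n_r - 1.0)
        i = min(int(pos), n_r - 2)
        w = pos - i
        return (1.0 - w) * values[i] + w * values[i + 1]

    value = [0.0] * n_r            # terminal condition V_{T+1} = 0
    policy = []
    for _ in range(T):             # periods T, T-1, ..., 1
        new_value, best_prices = [], []
        for r in r_grid:
            best_val, best_p = float("-inf"), None
            for p in p_grid:
                demand = beta0 + beta1 * p + beta2 * (p - r)
                r_next = alpha * r + (1.0 - alpha) * p
                val = (p - c) * demand + gamma * interp(value, r_next)
                if val > best_val:
                    best_val, best_p = val, p
            new_value.append(best_val)
            best_prices.append(best_p)
        value = new_value
        policy.insert(0, best_prices)   # earlier periods go to the front
    return r_grid, policy


r_grid, policy = solve_pricing_dp()
i = min(range(len(r_grid)), key=lambda k: abs(r_grid[k] - 4.3))
print("first period:", policy[0][i], " last period:", policy[-1][i])
```

The integrated model adds the inventory level as a second state and the order-up-to level as a second decision, which multiplies the size of both loops and is where the trade-off between memory and run time becomes relevant.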
By systematically varying the demand parameters $\beta_0$, $\beta_1$, $\beta_2$, the memory parameter $\alpha$, the discount factor $\gamma$, the costs $c$, $b$ and $h$, and the length of the planning horizon $T$ we find strong evidence that a base-stock-list-price policy is optimal under any considered setting. Hence, we only present illustrative examples in sections 6.1 to 6.3.
Loss-neutral customer behavior

The optimal policy's structure
As a first step we will concentrate on loss-neutral customer behavior ($\beta_2 = \beta_3$), which we investigated analytically in chapter 5. Although there we could only show the optimality of a base-stock-list-price policy for a two-period problem, we now find that this structure also extends to any time period of a finite time horizon with length $T > 2$. Extensive numerical studies give strong evidence that the expected profit $J_t(y, p, r)$ is jointly concave in any time period $1 \le t \le T$ in inventory level after ordering $y$ and selling price $p$ for any reference price $r$, and thus a base-stock-list-price policy is optimal for any time period $t$. Furthermore, we observe that the steady state solutions (5.3.8) and (5.3.9), derived analytically in section 5.3, are already attained within a relatively short planning horizon ($T = 15$ for the base scenario below).
Figures 6.2 to 6.9 give an illustrative example for the following base scenario: Expected loss-neutral demand is given by the parameters $\beta_0 = 100$, $\beta_1 = -20$ and $\beta_2 = -40$, whereas random demand follows a normal distribution with fixed standard deviation $\sigma = 20$. Because of the cost of capital, maintenance, insurance, loss and damage, the per period holding cost rates amount to approximately one percent of the ordering costs: $c = 0.5$ and $h = 0.005$. High service levels are ensured by setting the backlogging cost rates to about the same magnitude as the ordering costs: $b = 0.4$. Moreover, the memory parameter is given by $\alpha = 0.5$, the discount factor is set to $\gamma = 0.8$ and the total length of the planning horizon is given by $T = 15$. We want to mention here that all the results described below hold for any other tested parameter setting and, interestingly, also for non-linear expected demand functions.
Figure 6.2 shows that the expected profit $J_t(y, p, r)$ is jointly concave in inventory $y$ and price $p$. Moreover, it can be seen in figure 6.3 that the optimal expected profit $V_t(x, r)$ is jointly concave in inventory $x$ and reference price $r$ and increasing in both $x$ and $r$. Therefore a base-stock-list-price policy is optimal for any reference price $r$ (see figures 6.4 and 6.5). Thus, for simplicity, the two three-dimensional graphs of the optimal inventory level $y^*(x, r)$ and the optimal price $p^*(x, r)$ can be reduced to two two-dimensional graphs of the base-stock level $S^*(r)$ and list-price $P^*(r)$, which can be seen to be increasing in the reference price $r$ (see figures 6.8 and 6.9).
We now take a closer look at the behavior of the optimal decisions y* and p* in the state variables x and r over time. In section 2.3.3 we observed that for the model introduced by Federgruen and Heching (1999) price discounts were given at a higher inventory level, the ). This does not necessarily hold for the integrated model. Here, due to negative carry over effects, price discounts are generally smaller and are given as soon as the inventory level before ordering is higher than the base-stock level (see figure 6.7). Since we give smaller discounts, we have to react earlier in time in order to move the inventory level before ordering below the steady state level and thus reach a possible steady state. The model in section 2.3.3 aimed at reducing base-stocks over time and thus list-prices were increasing over time as a consequence. For a fixed standard deviation of demand, list-prices tended to be constant over time (see figure 2.8). In contrast, the integrated model including reference price effects behaves qualitatively completely differently. Here, similar to figure 3.3, list-prices are decreasing over time in order to benefit from the reference price effects and are increasing in reference price r (see figure 6.9). With decreasing prices over time (see figure 6.7 and 6.9) and also the resulting reference price effect, expected demands are increasing, necessitating higher base-stock levels over time (see figure 6.6 and 6.8).

The influence of the demand distribution
In this subsection we want to investigate the influence of different demand distributions and coefficients of variation.
uniform distributions, respectively. All three of them have their probability mass concentrated on the left (their mode is smaller than the expected value, i.e. they are positively skewed) and allow only for positive demands; therefore there is no need for truncating negative demands as in the case of the normal distribution.
In addition to studying the effects of different demand distributions with the same mean and variance, we also probe the influence of variable demand variation. Instead of considering a fixed variance $\sigma^2$ as in section 6.1.1, we analyze the effect when the stochastic term $\epsilon$ of the demand function (e.g. (5.0.1)) follows the truncated normal distribution function with mean zero and variance $\sigma^2 = c.v. \cdot E[D(p, r, \epsilon)]$, where $c.v.$ denotes the coefficient of variation. Note that for a constant coefficient of variation $c.v.$ and price $p$, higher reference prices $r$ result in a higher expected demand and thus a greater demand variation $\sigma^2$. Extensive numerical studies show that whatever demand distribution or coefficient of variation is used, again a base-stock-list-price policy is found to be optimal. Thus in the following we only consider the effects on the base-stock and list-price levels. For obtaining figures 6.10 to 6.15, again the base scenario from section 6.1.1 with $\beta_0 = 100$, $\beta_1 = -20$, $\beta_2 = -40$, $\alpha = 0.5$, $c = 0.5$, $h = 0.005$ and $b = 0.4$ is presumed. The effect of discounting prospective cash flows is excluded by setting $\gamma = 1$. Furthermore, a coefficient of variation $c.v. = 1$ is used where not denoted differently.
We now analyze the last time period of the planning horizon, t = 15. Since it is very costly to have unsold inventory on hand after the last time period, the main aim here is to reduce as much as possible the risk of not selling the inventory in stock in the last time period. The higher the degree of system uncertainty, that is, either a high coefficient of variation or a heavy-tailed distribution, the more the retailer aims to decrease the standard deviation of demand. This can be achieved by reducing the mean demand, since then the standard deviation is reduced by the same proportion. As demand is a decreasing function of price, it is optimal to respond to an increase in system uncertainty by increasing prices (see figures 6.11 and 6.15). This in turn results in a decreasing optimal base-stock level (see figures 6.10 and 6.14). Furthermore, note that in contrast to section 5.1, list-prices in the last time period do depend on the demand distribution and variation (see figures 6.11 and 6.15), since now the variation of demand is not fixed but depends on price and reference price.
However, in earlier time periods, the dominating objective is not to clear stock, but to optimize long-term profits. In order not to incur expensive backlogging costs, the aim is to have sufficient inventory in stock. As discussed above, for a heavy-tailed distribution the risk of high demands is higher than for symmetric distribution functions. Thus the optimal policy is to increase the inventory stock level for a higher degree of system uncertainty (see figure 6.12), which in turn results in lower optimal prices (see figure 6.13).

Joint versus sequential optimization
Research such as Whitin (1955) has already shown that the simultaneous determination of price and ordering or production quantity can yield a substantial revenue increase. The coordination of price and production decisions potentially increases profit and thus results in more efficient supply chains. In this section we explore, via numerical simulations, the size of the possible benefits of joint optimization compared to sequential optimization. In subsection 6.2.1 we give a short review, with extensions, of the results obtained by Federgruen and Heching (1999), while in subsection 6.2.2 we investigate how the benefits change when reference price effects are included in the model.

Classical operations research models
Consider the following ad-hoc, but not unrealistic, mode of operation in which the marketing and production decisions are made in stages. The model is decomposed such that marketing seeks to maximize its objective function first and the production decision is made second. The reason for the suboptimality of the separated model is that the two parties, marketing and production, consider two different objective functions. Since marketing takes demand fulfillment for granted, its objective reduces to maximizing expected revenues. In contrast, production also takes inventory costs into consideration. Thus the optimal production decision always depends on the actual inventory in stock. Figure 6.16 depicts the gains of jointly determining an optimal price and inventory level versus the sequential procedure, where marketing first determines the profit-optimal price $p_i = -(\beta_0 - \beta_1 c)/(2\beta_1)$ and then the production unit decides on an optimal stocking quantity without having the option of changing the price. The largest benefits of joint optimization are obtained towards the end of the planning horizon, or for a short planning horizon. In contrast to the comparisons in Federgruen and Heching (1999), who base all numerical results on a coefficient of variation, we always use a constant standard deviation in this section (not depending on price). This erodes a lot of the benefit of using a dynamic pricing model. Figure 6.17 shows relatively low benefits for low stock before ordering, which can be much higher for substantially larger inventory levels before ordering. The closer we get to the end of the planning horizon, the earlier this effect can be observed. This is intuitive, as the seller tries to reduce the risk of being left with unsold stock at the end of the planning horizon.
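The closed form of the marketing price quoted above can be recovered in a few lines. The sketch assumes that expected demand without reference effects is linear, $\beta_0 + \beta_1 p$ with $\beta_1 < 0$, and that marketing maximizes the unit margin times expected demand (the presence of $c$ in the formula suggests this objective). The numbers are the base-scenario values quoted in section 6.1, used here purely for illustration; the parameters behind figure 6.16 may differ.

```latex
\begin{align*}
\pi(p) &= (p - c)\,(\beta_0 + \beta_1 p), \qquad \beta_1 < 0,\\
\pi'(p) &= \beta_0 + 2\beta_1 p - \beta_1 c = 0
\;\Longrightarrow\;
p = -\frac{\beta_0 - \beta_1 c}{2\beta_1},\\
\beta_0 = 100,\ \beta_1 = -20,\ c = 0.5 &:\quad
p = -\frac{100 + 10}{-40} = 2.75 .
\end{align*}
```

Since $\pi''(p) = 2\beta_1 < 0$, the stationary point is indeed a maximum. The price does not depend on the inventory level, which is exactly what the production unit cannot correct for in the sequential procedure.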

[Figure 6.18: Sequential optimization of price and inventory vs. joint optimization. Outputs shown: optimal total expected profit of the sequential approach and of the joint approach, each as a function of $(x_1, r_1)$.]

Figure 6.18 illustrates that in a sequential approach the marketing/sales department first determines an optimal price without considering any inventory decisions and taking demand fulfillment for granted. The optimal price $p^*(r)$ in the decomposed model therefore depends only on the reference price $r$. This price is then passed on to the production unit of the company, which decides on an optimal stocking quantity without being able to change the price. Here the optimal stocking quantity $y^*(x)$ depends only on the inventory level before ordering $x$. In the joint approach, both decisions are taken simultaneously, and thus the optimal price $p^*(x, r)$ and the optimal stocking quantity $y^*(x, r)$ are both functions of inventory $x$ and reference price $r$. Hence, with this better possibility of reacting to the system dynamics, a simultaneous optimization yields profits at least as high as a sequential procedure, although the sequential approach is already highly sophisticated in that it incorporates non-stationary prices, which vary over time.
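The difference between the two approaches can be made concrete with a small one-period grid search. This is only a sketch under stated assumptions, not the dynamic program used for the figures: it assumes a linear, loss-neutral expected demand $\beta_0 + \beta_1 p + \beta_2 (p - r)$, an additive discretised normal shock with constant standard deviation, demand truncated at zero, and hypothetical cost values (not the parameter set of table 6.1). All names are illustrative.

```python
import numpy as np

# Hypothetical illustration values, not the parameter set of table 6.1.
BETA0, BETA1, BETA2 = 100.0, -20.0, -40.0  # assumed linear, loss-neutral demand
C, H, B = 0.5, 0.05, 2.0                   # ordering, holding, backlogging cost
SIGMA = 10.0                               # constant standard deviation of demand

PRICES = np.linspace(0.5, 3.0, 251)        # candidate prices p
LEVELS = np.arange(0.0, 301.0)             # candidate inventory levels y after ordering
EPS = np.linspace(-3.0, 3.0, 61) * SIGMA   # discretised demand shock
WEIGHTS = np.exp(-0.5 * (EPS / SIGMA) ** 2)
WEIGHTS /= WEIGHTS.sum()


def mean_demand(p, r):
    """Assumed expected demand: beta0 + beta1*p + beta2*(p - r)."""
    return BETA0 + BETA1 * p + BETA2 * (p - r)


def profit_table(x, r):
    """Expected one-period profit for every (p, y) pair; y < x is infeasible."""
    demand = np.maximum(mean_demand(PRICES, r)[:, None] + EPS[None, :], 0.0)
    exp_demand = demand @ WEIGHTS
    gap = LEVELS[None, :, None] - demand[:, None, :]          # y - D
    cost = (H * np.maximum(gap, 0.0) + B * np.maximum(-gap, 0.0)) @ WEIGHTS
    table = (PRICES * exp_demand)[:, None] - C * (LEVELS[None, :] - x) - cost
    table[:, LEVELS < x] = -np.inf
    return table


def joint_vs_sequential(x, r):
    """Return (joint profit, sequential profit) for state (x, r)."""
    table = profit_table(x, r)
    # Sequential: marketing ignores inventory and maximises (p - c) * E[D].
    i_seq = np.argmax((PRICES - C) * mean_demand(PRICES, r))
    seq = table[i_seq].max()   # production then picks the best y >= x
    joint = table.max()        # both decisions are taken together
    return joint, seq


if __name__ == "__main__":
    for x in (0.0, 150.0):
        joint, seq = joint_vs_sequential(x, r=2.0)
        print(f"x = {x:5.1f}: joint - sequential = {joint - seq:.3f}")
```

In this toy setting the two approaches nearly coincide when the inventory before ordering is low, because with an additive shock the price and the stocking decision decouple. A clearly positive gain appears only when the system is overstocked and the joint approach can use the price to clear inventory, which mirrors the qualitative pattern described for figures 6.17 and 6.22.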
To obtain figures 6.19 to 6.24, the parameter set in table 6.1 is used. In figures 6.19 and 6.20, we compare the optimal base-stock and price/reference price paths for the sequential and joint approaches from figure 6.18. For the time being, we assume that $r_1$ equals the steady-state price and $x_1 = 0$ to avoid having a transient phase at the beginning of the planning horizon. Furthermore, we assume that the actual demand realization equals its expectation, $D(p_t, r_t, \varepsilon_t) = E[D(p_t, r_t, \varepsilon_t)]$, in any time period $t$. Using joint optimization, price $p_t^*$ and base-stock $y_t^*$ leave their steady states later in time.
This is because we have the opposing strategies of benefiting from the reference effects by lowering prices towards the end of the planning horizon (see figure 3.3) and aiming at a clearance of stock at the end of the planning horizon (see figure 2.8). The question now is how much such a joint optimization of price and inventory increases the benefits over the sequential optimization. Figure 6.21 shows that, similar to figure 6.16, we get the largest benefits of joint optimization towards the end of the planning horizon, or for a short planning horizon. In contrast to the comparisons in Federgruen and Heching (1999), who base all numerical results on a coefficient of variation, we always use a constant standard deviation (not depending on price). As in subsection 6.2.1, this generally results in relatively low benefits. Again, higher benefits can be obtained for substantially larger inventory levels before ordering (see figure 6.22 for the last time period t = 50). Similar to figure 6.17, this effect can be observed for any time period t. However, in comparison to the model without reference effects (compare subsection 6.2.1), for smaller t this effect only appears for inventory levels before ordering much higher than 200. This is because the pricing strategy under reference price effects enables us to clear higher stock levels in later time periods. Figure 6.23 shows that the benefit of the joint model with reference effect is at least 10 times the benefit of the model without reference effect, and is considerably higher when the reference effect increases. While in the classical setting price is only varied to control inventory, here price has its own dynamics, and incorporating the influence of the reference price increases the benefits of integrating pricing and inventory control significantly.
Moreover, a significant difference to the model of subsection 6.2.1 is that while there the benefit converges to zero for long planning horizons, here the benefit converges to a value considerably higher than zero (depending on the parameters chosen). This effect is more prominent the more the starting reference price differs from the optimal steady state price (see figure 6.24).

Extensions
In this section we give some extensions that are worth considering, but are beyond the scope of this thesis. As they could be a subject of further research, we have chosen to give a preliminary idea of the results and difficulties here. We address the issue of loss-averse and loss-seeking customer behavior and then give a brief description of what happens when fixed ordering costs are included in the model.
For loss-averse customer behavior, where consumers respond more to surcharges than to discounts ($\beta_2 < \beta_3$), we already found in section 5.1 that a base-stock-list-price policy is optimal (see figures 6.25 and 6.26 for the last time period t = 15). However, we could also show in section 5.1 that although the optimal price is increasing in the reference price, this no longer holds true for the optimal inventory level. We can see in figures 6.27 and 6.28 that this behavior extends to any other time period. Since the optimal price path is monotone over time according to Popescu and Wu (2007) (compare section 3.3.2), the optimal price converges monotonically to a steady state, which depends on the initial reference price level (see figure 3.6). Thus there are grounds for the supposition that the same is true for the base-stock level.
In the case of loss-seeking customers, where the demand function is steeper for gains than for losses and consumers stockpile when prices are low ($\beta_2 > \beta_3$), we can see from the example of figures 6.29 and 6.30 that a base-stock-list-price policy is again optimal. In contrast to the loss-averse customer behavior discussed above, the base-stock and list-price levels lose not only their monotonicity in the reference price, but also their continuity. As already described in section 3.3.2, the jump discontinuity in the optimal price results in a cycling policy over time, and thus it stands to reason that the optimal stocking quantity will also cycle over time.
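The two behavioral cases can be contrasted with a minimal sketch of a kinked reference effect. It assumes that $\beta_2$ weights surcharges ($p > r$) and $\beta_3$ weights discounts ($p < r$), both negative, in a piecewise-linear term added to demand; this functional form and the coefficient values are assumptions made for illustration and are not taken from the thesis.

```python
def reference_effect(p, r, beta2, beta3):
    """Assumed kinked effect: beta2 applies to surcharges, beta3 to discounts."""
    surcharge = max(p - r, 0.0)   # perceived loss when the price exceeds r
    discount = min(p - r, 0.0)    # perceived gain when the price is below r
    return beta2 * surcharge + beta3 * discount


if __name__ == "__main__":
    r = 2.0
    cases = {
        "loss-averse (beta2 < beta3)": (-40.0, -20.0),
        "loss-seeking (beta2 > beta3)": (-20.0, -40.0),
    }
    for label, (b2, b3) in cases.items():
        loss = reference_effect(r + 0.5, r, b2, b3)   # demand change for a surcharge
        gain = reference_effect(r - 0.5, r, b2, b3)   # demand change for a discount
        print(label, loss, gain)
```

With these illustrative coefficients, a surcharge of 0.5 costs loss-averse customers twice as much demand as a discount of the same size wins, and the relation is reversed for loss-seeking customers.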
As a last extension we consider the case where the ordering costs also include a fixed cost component, for loss-neutral customer behavior. In contrast to section 6.1.1, we find that here a simple base-stock-list-price policy is not optimal. Chen and Simchi-Levi (2004a) have already shown for an integrated pricing and inventory control model without reference prices that in the case of fixed ordering costs an $(s, S, p)$-policy is optimal: if the inventory level at the beginning of period $t$ is below the reorder point $s_t$, an order is placed to raise the inventory level to the order-up-to level $S_t$, and a price $p_t$ is charged. Otherwise no order is placed and a different price $p_t(x)$ is offered, which is decreasing in the inventory level $x$. Figures 6.33 and 6.34 show that this result extends to the integrated model including reference price effects. In the case of fixed ordering costs, in contrast to loss-seeking customer behavior, a jump discontinuity occurs in the inventory level $x$ instead of in the reference price $r$ (see figures 6.35 and 6.36).
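The decision rule of such an $(s, S, p)$-policy can be sketched for a single period as follows. The function name, the numerical values and the particular decreasing price function are hypothetical and serve only to show the structure of the rule, not the optimal levels computed in the thesis.

```python
from typing import Callable, Tuple


def ssp_decision(
    x: float,
    s: float,
    S: float,
    list_price: float,
    price_without_order: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (inventory level after ordering, price) for one period."""
    if x < s:
        # Below the reorder point: order up to S and charge the list price.
        return S, list_price
    # Otherwise place no order and charge a price that decreases in x.
    return x, price_without_order(x)


if __name__ == "__main__":
    # Hypothetical levels and a hypothetical linear markdown in inventory.
    markdown = lambda x: max(2.5 - 0.005 * x, 0.5)
    for x in (10.0, 50.0, 120.0):
        print(x, ssp_decision(x, s=20.0, S=80.0, list_price=2.5,
                              price_without_order=markdown))
```

The jump in the inventory dimension mentioned in the text corresponds to the switch between the two branches at the reorder point.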

Summary, Conclusion and Future Research
This thesis addressed the problem of simultaneously determining a pricing and inventory replenishment strategy by combining two literature streams: the operations-oriented stream and the marketing-oriented stream. In order to benefit from the strengths of both research areas, we combined the rich cost models commonly used in operations research with the sophisticated demand models so far mainly applied in marketing. The integration of the consumers' willingness to pay with pricing and inventory control models increases the dimension of the state space of the underlying dynamic program, which substantially increases the model's complexity. Within this context we studied how the additional dynamics affect an optimal policy and whether a simple policy such as a base-stock-list-price policy still holds in such a setting.
For the one-period case we could analytically prove the optimality of a base-stock-list-price policy under very general conditions. Furthermore, we showed additional structural properties in the state space describing the consumers' willingness to pay. However, due to the added complexity of the model, an extension even to the two-period version of the problem caused major complications in analytical tractability, since the value function is no longer 'well behaved' and thus commonly used proof techniques could not be applied. With tedious and extensive mathematical investigations, we proved the optimality of a base-stock-list-price policy in the two-period setting for the linear and loss-neutral demand function. We also suggested a way of proving the optimality of a base-stock policy for the multi-period case, which only holds under very restrictive assumptions. However, we were able to give useful and explicit steady-state solutions for the multi-period setting, provided that such a steady state exists. Extensive numerical studies suggest that the optimal solutions converge relatively quickly in time (for reasonable parameter settings, convergence could even be observed within fifteen time periods).
Using numerical simulations, we extended the results obtained analytically to more general settings, such as a longer planning horizon, more general demand functions, or a more complex cost structure. Moreover, we investigated the potential increase of profit from simultaneously determining optimal prices and stocking quantities compared to a sequential optimization, where prices are set first by the marketing department of a company and then the production unit decides on the optimal stocking quantity, without being able to change prices. We found that the benefits increase considerably when reference price effects are included in the model. By using constant standard deviations of demand, we achieved at least ten times the benefit attained by a joint model without reference prices, which makes an integration of pricing and inventory control with reference price effects by all means worth the effort.
Substituting $y_1$ by $y_1 + \delta$ lets us approach the kink from the right side with respect to $y_1$. Similarly to above, $x_2(y_1 + \delta, p_1, r_1, \varepsilon_1)$ becomes