This article is based on ideas originally published by VoxEU – Centre for Economic Policy Research (CEPR) and has been independently rewritten and extended by The Economy editorial team. While inspired by the original analysis, the content presented here reflects a broader interpretation and additional commentary. The views expressed do not necessarily represent those of VoxEU or CEPR.
This article was independently developed by The Economy editorial team and draws on original analysis published by East Asia Forum. The content has been substantially rewritten, expanded, and reframed for broader context and relevance. All views expressed are solely those of the author and do not represent the official position of East Asia Forum or its contributors.
Keith Lee is Professor of AI and Finance at the Gordon School of Business, Swiss Institute of Artificial Intelligence (SIAI). His primary research lies in financial mathematics and AI-driven computational science, with a focus on quantitative modeling of complex economic and financial systems. His work integrates machine learning, stochastic modeling, and data-centric methods to study structural transformations in markets and institutions.
In recent years, his research has extended to the economic and fiscal implications of technological change, including the interaction between artificial intelligence, demographic shifts, and public finance sustainability.
He holds a PhD in Mathematical Finance from Boston University, and previously earned an MSc in Finance and Economics from the London School of Economics. He completed his undergraduate studies in Economics at Seoul National University under the Korea Foundation for Advanced Studies scholarship program.
He regularly contributes analytical essays on the broader socioeconomic implications of AI to The Economy Review.
Sleep. It's something we all need but often take for granted. As people start to realize just how important it is for our health and well-being, the question of how we can detect and understand our sleep states becomes more critical. This paper takes a closer look at that question, breaking it down into five key sections that will guide us toward better solutions and deeper understanding.
The paper starts by looking at accelerometer data for sleep tracking. This method is popular because it’s non-intrusive and works well with wearable devices for continuous monitoring. It explains how Euclidean Norm Minus One (ENMO, standardized acceleration vector magnitude)-based metrics can be a simple alternative to complex medical exams. Next, it reviews current research, highlighting the strengths and weaknesses of different methods. It also points out gaps in the accuracy and reliability of existing models.
Building on the insights gained from the review, the paper then addresses specific challenges, such as sleep signal variability and irregular sleep intervals. It outlines data preprocessing techniques designed to manage these issues, thereby improving the robustness of sleep state detection. To achieve this, a novel likelihood ratio comparison methodology is introduced, which aims to increase generalizability, ensuring effectiveness across diverse populations. Lastly, the paper concludes by acknowledging the limitations of the current study and proposing future research directions, such as incorporating additional physiological signals and developing more advanced machine learning algorithms.
Sleep Tracking Based on Accelerometer Data
According to the National Health Insurance Service, 1,098,819 patients visited hospitals for sleep disorders in 2022, a 28.5% increase from 855,025 in 2018. As the number of sleep disorder patients rises, interest in high-quality sleep is also growing. However, since the causes and characteristics of sleep disorders vary among patients, there is a burden of needing different treatment methods and diagnostic tests.
Patients suspected of having sleep disorders usually undergo detailed diagnosis through polysomnography. This test involves various methods, including video recording, sleep electroencephalogram (EEG, using C4-A1 and C3-A2 leads), bilateral eye movement tracking, submental EMG, and bilateral anterior tibialis EMG to record leg movements during sleep.
Polysomnography has its limitations. Patients must visit specialized facilities, and it's only a one-time session. As a result, there's increasing demand for tools that offer more convenient and continuous sleep monitoring.
Measuring Movement Using Accelerometer Data
Recently, health management through wearable devices has become increasingly common, enabling real-time data collection. Wrist-worn watches can monitor activity levels, and for sleep measurement, both an accelerometer sensor and a photoplethysmography (PPG) sensor are typically used.
The accelerometer sensor tracks body movements, while the PPG sensor uses light to measure blood flow in the wrist tissue, which helps measure heart rate. Although using data from both sensors could improve the accuracy of sleep measurement, this study only uses accelerometer data due to limitations on data usage. The reasons for this decision will be explained further on.
The accelerometer data consists of three axes, as shown in the figure below [1].
The $x$-axis captures movement in the horizontal direction, parallel to the ground, the $y$-axis captures lateral movement (e.g., how far the arms swing to the sides), and the $z$-axis captures vertical movement (peaking when the legs cross over during walking). Note that the meaning of each axis depends on the sensor's reference axes. If these reference axes change, movement is usually measured along whichever axis shows the largest change in values.
The graph below shows an example of 3-axis data [4]. This graph shows how the measurements change when walking with the arms swinging compared to walking with the arms held still. The changes in the $x$, $y$, and $z$ axes represent changes in the mean values, and it can be seen that the signal shown in green, when the arms are fixed, has the most significant variation. Therefore, accelerometer data can vary for the same action if the sensor's position or reference axes change.
Making Useful Variables Through Transformation
To solve the problem of axes changing when the sensor's orientation shifts, it is important to convert the data into straightforward yet informative variables. Many studies use summary metrics (or summary measurements), which combine the $x$, $y$, and $z$ axis values into a single value and thereby reduce the impact of changes in sensor orientation.
Examples of summary metrics include Euclidean Norm Minus One (ENMO, standardized acceleration vector magnitude), Vector Magnitude Count (VMC), Activity Index (AI), and Z-angle (wrist angle). Let’s take a closer look at ENMO and Z-angle, as they relate to the signal data from wearable devices discussed earlier.
As shown in the accelerometer diagram in Figure 1, interpreting the dynamic acceleration in sensor data requires accounting for the effect of gravity ($g$). ENMO does this by taking the Euclidean norm of the three axis values and subtracting one unit of gravitational acceleration (1 $g$), which is where the name Euclidean Norm Minus One comes from. This can be expressed mathematically as follows.
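For reference, the definition commonly used in the accelerometry literature is sketched below, with $x$, $y$, and $z$ expressed in units of $g$. Truncating negative values at zero is a widespread convention and is assumed here rather than confirmed by this study.

$$\mathrm{ENMO} = \max\left(\sqrt{x^2 + y^2 + z^2} - 1,\ 0\right)$$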
The Z-angle is a summary metric for the wrist angle, which can be considered as the angle of the arm relative to the body's vertical axis. It can be expressed using the following formula.
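As a rough illustration, both summary metrics can be computed from a single three-axis sample. The sketch assumes the definitions commonly used in the literature: ENMO as the Euclidean norm minus 1 $g$ (truncated at zero), and the Z-angle as $\arctan\left(z / \sqrt{x^2 + y^2}\right) \cdot 180/\pi$. The function names and sample values are hypothetical and are not taken from the study's own code.

```python
import math


def enmo(x: float, y: float, z: float) -> float:
    """Euclidean Norm Minus One in g, truncated at zero (assumed convention)."""
    return max(math.sqrt(x * x + y * y + z * z) - 1.0, 0.0)


def z_angle(x: float, y: float, z: float) -> float:
    """Angle of the z-axis relative to the horizontal plane, in degrees."""
    # atan2 equals arctan(z / sqrt(x^2 + y^2)) and also handles x = y = 0.
    return math.degrees(math.atan2(z, math.hypot(x, y)))


# Hypothetical samples in g: a wrist at rest and a wrist in motion.
at_rest = (0.0, 0.0, 1.0)
moving = (0.6, 0.8, 1.0)

print(enmo(*at_rest), z_angle(*at_rest))
print(enmo(*moving), z_angle(*moving))
```

Because ENMO depends only on the vector magnitude, rotating the sensor leaves it unchanged, which is the property that makes it robust to shifts in the reference axes.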
To gain an intuitive understanding, let's look at the actual ENMO values measured in real-life situations. Figure 3 below is a table summarizing the ENMO measurements during daily activities [2].
When standing, the ENMO was measured at an average of 1.03g, while it increased to 10.3g during everyday walking. This clearly demonstrates that the ENMO value is lower with minimal movement and rises as activity levels increase. In other words, because human movement constantly speeds up and slows down rather than proceeding at a constant speed, the magnitude of acceleration can serve as a measure of activity level.
While it may appear that raw $x$, $y$, and $z$ axis data offers more information due to its detail, this study seeks to demonstrate that condensing this information into a single summary metric doesn't significantly impact our ability to accurately estimate sleep states.
Additionally, a basic model revealed that excluding Z-angle data does not result in significant information loss. When we used a tree model to evaluate the explanatory power of variables with statistical metrics from both Z-angle and ENMO, the ENMO variables were found to be much more important. In fact, all of the top 10 most important variables were related to ENMO. Since the importance of Z-angle variables was significantly lower, this study will focus on using ENMO as the primary basis for addressing the problem.
Review of Previous Studies
Existing Methodologies Focused on Optimization
Earlier, we explored several summary measurement variables, such as ENMO, VMC, AI, and Z-angle. More recently, research has focused on identifying new summary measurements, like MAD (Mean Amplitude Deviation), using axis data collected from accelerometer sensors. This kind of variable transformation requires advanced domain knowledge, and the process of validating these reduced variables is complex.
In previous studies, various summary measurements were investigated, and temporal statistics (such as deviations, averages, minimums, and maximums over overlapping or non-overlapping one-minute intervals) were used for classification or detection through machine learning models, heuristic models, or regression models. Figure 4 below summarizes the key methodologies from previous research [5].
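To make the idea of temporal statistics concrete, the following minimal sketch summarizes a signal in non-overlapping one-minute windows. It assumes 5-second samples (12 per minute), and the function name and ENMO values are hypothetical. It is not a reproduction of any specific earlier study.

```python
from statistics import mean, pstdev


def window_features(signal, samples_per_window=12):
    """Summarize a signal in non-overlapping windows (12 samples = 1 minute at 5 s)."""
    features = []
    last_start = len(signal) - samples_per_window
    for start in range(0, last_start + 1, samples_per_window):
        window = signal[start:start + samples_per_window]
        features.append({
            "mean": mean(window),
            "std": pstdev(window),  # population standard deviation of the window
            "min": min(window),
            "max": max(window),
        })
    return features


# Hypothetical ENMO values: one quiet minute followed by one active minute.
enmo_values = [0.01] * 12 + [0.02, 0.30] * 6
features = window_features(enmo_values)
for row in features:
    print(row)
```

Each window yields several features, so the dimensionality of the model input grows quickly as more statistics and window lengths are added.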
Additionally, the evaluation metrics used in sleep research are as follows [2]:
Sensitivity (actigraphy = sleep when PSG = sleep)
Specificity (actigraphy = wake when PSG = wake)
Accuracy: total proportion correct
WASO (Wakefulness After Sleep Onset): the total time spent awake after sleep onset during the sleep period
SE (Sleep Efficiency): the proportion of sleep within the periods labeled by polysomnography
TST (Total Sleep Time): calculated as the sum of sleep epochs per night
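The metrics listed above can be sketched for a single night as follows. The sketch assumes epoch-level labels coded 0 for sleep and 1 for wake, with no missing labels. It takes WASO as the time scored awake between the first and last predicted sleep epochs, and SE as TST divided by the length of the recording passed in. These choices, along with the function name and example labels, are illustrative assumptions.

```python
SLEEP, WAKE = 0, 1  # assumed coding: 0 = sleep, 1 = wake


def sleep_metrics(reference, predicted, epoch_minutes=1.0):
    """Compare predicted epochs against reference (e.g. PSG) epochs."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("sequences must be non-empty and equally long")
    pairs = list(zip(reference, predicted))
    true_sleep = sum(r == SLEEP and p == SLEEP for r, p in pairs)
    true_wake = sum(r == WAKE and p == WAKE for r, p in pairs)
    ref_sleep = sum(r == SLEEP for r in reference)
    ref_wake = sum(r == WAKE for r in reference)

    sleep_idx = [i for i, p in enumerate(predicted) if p == SLEEP]
    tst = len(sleep_idx) * epoch_minutes
    waso = 0.0
    if sleep_idx:
        # Wake epochs between the first and last predicted sleep epoch.
        inside = predicted[sleep_idx[0]:sleep_idx[-1] + 1]
        waso = sum(p == WAKE for p in inside) * epoch_minutes

    return {
        "sensitivity": true_sleep / ref_sleep if ref_sleep else float("nan"),
        "specificity": true_wake / ref_wake if ref_wake else float("nan"),
        "accuracy": (true_sleep + true_wake) / len(pairs),
        "tst_minutes": tst,
        "waso_minutes": waso,
        "sleep_efficiency": tst / (len(predicted) * epoch_minutes),
    }


# Hypothetical ten-epoch night.
reference = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
predicted = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
metrics = sleep_metrics(reference, predicted)
print(metrics)
```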
Limitations of Optimization and Increased Sensitivity to Changes in Data Patterns
Machine learning models like Random Forests and neural networks, such as CNNs (Convolutional Neural Networks) and LSTMs (Long Short-Term Memory networks), are considered "high complexity" due to their focus on achieving high accuracy. This often results in having a large number of parameters, which increases the risk of overfitting. When overfitting occurs, the model might learn the noise in the training data instead of the actual patterns, especially if there isn't enough data.
As a result, the model's performance can decline when applied to new datasets. In practical research, therefore, these high-complexity models sometimes struggle to accurately detect the exact moments of falling asleep or waking up. By focusing too much on optimization, they overlook the importance of generalization.
Are simpler models, like regression models, free from optimization issues? Although regression models are generally less sensitive to noise, their inference relies on the assumption that the errors follow a normal distribution. If this assumption is not met, the standard errors of the regression coefficients can be high relative to the coefficients themselves. This increases the p-value, reducing the significance of the coefficients and making the model's results less reliable.
Since sleep data often does not follow a normal distribution, additional optimization is needed for regression models like the Cole-Kripke[3] and Oakley[6] models. While these simpler models may be less accurate with target data compared to machine learning or neural network models, their low complexity and optimized adjustments make them useful as baseline models in research, often used alongside polysomnography.
When users have only recently started using wearable devices, there is often a need to classify sleep states with limited data. Early data may lack representativeness, making it challenging to rely on data-intensive models from the machine learning or deep learning fields. Traditional regression models that require extensive optimization are also not sustainable in these cases. This challenge becomes even more significant when analyzing data from multiple users rather than just one. Therefore, this study aims to introduce data transformation and model transformation methodologies that can improve generalization performance.
Characteristics and Collection Methods of ENMO Data
Before diving into the detailed data preprocessing steps and methodologies that aim to overcome the limitations of previous research, let’s take a closer look at the characteristics of ENMO data.
ENMO signals are collected at 5-second intervals and can be analyzed in combination with sleep state labels assigned through sleep diaries. The criteria for labeling sleep states in the sleep diary are as follows:
Sleep is assumed if the sleep state persists for at least 30 minutes.
The longest sleep period during the night is recorded as the sleep state. However, there is no rule limiting the number of sleep episodes that can occur within a given period. For example, if an individual sleeps from 1:00 to 6:00 and again from 17:30 to 23:30 on the same day, both sleep periods are valid and counted. This approach naturally accommodates different sleep patterns, such as early morning and evening sleep, which can be influenced by work schedules.
To help with understanding, let's take a look at the sample data in the graph below. This data was collected over approximately 30 days from one individual, specifically looking at their Z-angle and ENMO signals. Sleep periods are marked as 0, active periods as 1, and -1 indicates cases where the label values in the sleep diary are missing due to device or recording errors.
As expected, we can observe a noticeable periodicity as the sleep periods (0, in red) and active periods (1, in green) alternate. While not everyone exhibits the same sleep pattern, the overarching cycle of sleep and wakefulness remains consistent. Therefore, this study will focus on using generalized data transformations to better distinguish between sleep and wake cycles. In the following section, we will introduce modeling methods that prioritize generalization.
The Z-angle data also showed a cyclic pattern. However, as mentioned earlier, ENMO data is significantly more important than Z-angle data and results in less information loss. Therefore, in the following methodologies, only the ENMO variable was used.
Considering Variability of Sleep Signals and Irregularity of Sleep Intervals
Interestingly, even during sleep, there are small fluctuations. This occurs because sleep consists of different stages, as many people know. These stages are usually classified based on the criteria shown in the diagram below.
In the previous study by Van Hees, sleep stages were classified based on the same diagram. The concept of sleep stages suggests that body movements vary depending on the stage, which can cause subtle fluctuations in sleep signals. As shown in the ENMO data in Figure 5, the $y$ values during sleep periods (indicated by red bars) are not uniform.
It naturally occurred to us that if the signals during sleep periods could be processed into more consistent signals, detecting sleep states would become easier. The goal of stabilization is not to eliminate sleep signals entirely but to preserve their characteristics while keeping their values relatively stable compared with the variance in the raw data.
Building on this idea, generalization becomes achievable even though the amount of tossing and turning varies between individuals and across all users. While it might be tempting to skip over these complex processes, doing so would be unwise. Previous studies have often overemphasized optimization, which can lead to problems like overfitting. A well-known safeguard is regularization, in which a penalty term is added to the objective, a formulation closely related to the Lagrange multiplier method.
In this study, we aimed to develop a methodology with superior generalization performance by processing the data based on insights gained from a more detailed analysis of the data characteristics and modifying the model accordingly. We hope this discussion helps convey the importance of having a solid rationale in the data preprocessing stage to build a reliable model.
Stabilizing Sleep Signals Through Data Transformation
The initial approach to stabilizing sleep signals focused on removing outliers by applying standard filtering techniques. Specifically, we used Power Spectral Density (PSD), a quantity derived from the Fast Fourier Transform (FFT). PSD describes how signal power, the squared FFT magnitude $|FFT|^2$, is distributed across different frequency bands.
However, after applying PSD to the ENMO signal data, we found that 99.8% of the entire dataset remained, failing to achieve the intended stabilization of sleep-specific signals. As shown in Figure 6, the variability of the processed ENMO signals (indicated by the red bars) during sleep periods (the area between the red and green lines, in that order) remained evident.
To see if smoothing the data would resolve the issue, a Kalman filter was applied. Despite incorporating covariance from previous data, it failed to stabilize the sleep signals. As shown in Figure 7, much of the variability in the processed ENMO signals during sleep periods persisted. Additionally, the Kalman filter performed poorly in detecting sleep states and had higher computational costs compared to PSD, mainly due to the use of covariance from prior data.
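For readers unfamiliar with the technique, a scalar Kalman filter under a random-walk state model can be sketched as below. The noise variances, the function name, and the test signal are hypothetical. This is not the configuration used in the study, only a minimal illustration of how each new estimate blends the previous estimate with the latest measurement.

```python
def kalman_smooth(measurements, process_var=1e-4, measurement_var=1e-2):
    """Scalar Kalman filter assuming the true level follows a random walk."""
    estimate = measurements[0]
    error = 1.0  # initial uncertainty of the estimate (assumed)
    smoothed = []
    for z in measurements:
        # Predict: the level is assumed unchanged, so only uncertainty grows.
        error += process_var
        # Update: weight the new measurement by the Kalman gain.
        gain = error / (error + measurement_var)
        estimate += gain * (z - estimate)
        error *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


# Hypothetical noisy signal alternating between 0 and 1.
noisy = [0.0, 1.0] * 20
smoothed = kalman_smooth(noisy)
print(smoothed[-5:])
```

The running error variance carried from step to step is the "covariance from previous data" that adds to the computational cost mentioned above.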
Finding Periodicity in Irregular Intervals
The initial data transformation overlooked one of the most important characteristics of the data: even for a single user, the times of falling asleep and waking up are not consistent. Therefore, this time we used the Lomb-Scargle periodogram, a method designed to detect periodic signals in observations with uneven spacing.
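The classical Lomb-Scargle periodogram can be written in a few lines, which helps show why it tolerates uneven spacing: it fits sines and cosines directly at the observed time stamps instead of assuming a regular grid. The sketch below is a plain implementation of the textbook formula on hypothetical data (time in days, one cycle per day). The study's actual implementation and parameters are not known here.

```python
import math
import random


def lomb_scargle(times, values, frequencies):
    """Classical Lomb-Scargle power at each frequency (cycles per time unit)."""
    mean_y = sum(values) / len(values)
    y = [v - mean_y for v in values]
    power = []
    for f in frequencies:
        w = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2.0 * w * t) for t in times),
                         sum(math.cos(2.0 * w * t) for t in times)) / (2.0 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        term_c = yc * yc / cc if cc > 0.0 else 0.0
        term_s = ys * ys / ss if ss > 0.0 else 0.0
        power.append(0.5 * (term_c + term_s))
    return power


# Hypothetical data: 200 unevenly spaced samples over 10 days, daily cycle.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 10.0) for _ in range(200))
values = [math.sin(2.0 * math.pi * t) for t in times]
frequencies = [0.1 + 0.05 * k for k in range(59)]  # 0.1 to 3.0 cycles per day

power = lomb_scargle(times, values, frequencies)
dominant = frequencies[power.index(max(power))]
print(round(dominant, 2))
```

The frequency with the highest power is the dominant frequency. For sleep data it would be expected near one cycle per day.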
Figure 8 below visualizes the data after applying the Lomb-Scargle periodogram. Although the signals in the sleep periods appear almost uniform due to the long duration of the entire dataset, zooming in on specific intervals reveals that while the variability has been reduced, the characteristics of the signals have been preserved as much as possible.
Beyond applying this to a single ID value, we also examined the results for all IDs without missing data. The dominant frequency showed a linear pattern with power. Therefore, for frequencies observed within the regression line, we determined that filtering using the typical values within the linear range would not significantly reduce the accuracy of the predictions.
Signal processing methods like the Lomb-Scargle periodogram, which align with the principles of FFT, need sufficient data to detect periodic patterns. In the case of ENMO data, at least a full day must pass to observe a complete cycle of sleeping and waking.
Thus, if the sample period for each ID was less than 3–5 days or there were many device omissions, filtering was done using the dominant frequency data from the training data. When there was at least about 5 days of data available, filtering was applied individually based on each ID.
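The fallback rule described above can be expressed as a small decision function. The five-day cutoff follows the text, while the missing-data cutoff, the function name, and the parameter names are hypothetical, since the study does not state how "many device omissions" was quantified.

```python
def choose_dominant_frequency(days_observed, missing_fraction,
                              individual_freq, training_freq,
                              min_days=5.0, max_missing_fraction=0.5):
    """Pick the frequency used for filtering one ID's signal.

    max_missing_fraction is an assumed cutoff, not a value from the study.
    """
    enough_data = days_observed >= min_days
    mostly_complete = missing_fraction <= max_missing_fraction
    if individual_freq is not None and enough_data and mostly_complete:
        return individual_freq  # filter based on this ID's own data
    return training_freq        # fall back to the training-data frequency


# Hypothetical IDs: a new user and a long-term user (frequencies in cycles/day).
print(choose_dominant_frequency(2.0, 0.1, 1.1, 1.0))
print(choose_dominant_frequency(12.0, 0.1, 1.1, 1.0))
```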
Likelihood Ratio Comparison
In the previous section, we explored how data can be transformed as a method of generalization. In this section, we will examine how model transformation can improve generalization performance.
Sleep and Awake Period Distributions
When examining the ENMO signal data after applying the Lomb-Scargle periodogram, neither the awake nor the sleep period data exhibited a uniform distribution.
As shown in Figure 10, the distribution shapes of the two periods also differ. Notably, there is a distinct difference in the shape of the distribution peaks: the joint distribution peak during sleep periods forms a smooth curve, whereas the peak during awake periods appears more angular.
Interestingly, upon closer inspection, despite the different distribution shapes, the peak values for each ID are clustered around 0 on the x-axis, whether it’s during sleep or awake periods.
Similarly, in both Figure 11a (entire dataset) and Figure 11b (9% of the entire dataset), the peak values of the sleep and awake distributions did not change significantly. Additionally, the peak values in Figure 12, which shows 800 randomly sampled observations from Figure 11b, also showed minimal variation.
Using Sleep and Awake Period Distributions
We examined whether the peak values of the distribution functions were different or the same. This was part of an effort to apply a likelihood ratio (LR) comparison method by utilizing the distribution information of sleep and awake periods to generalize the sleep state detection method. If the distributions are known, approaching the problem using Maximum Likelihood Estimation (MLE) is the most appropriate method. Similarly, we aimed to model based on the Likelihood Ratio (LR) by using the information from the sleep and awake distributions.
Sleep and awake distributions may not follow commonly known probability distributions (e.g., Gaussian, Poisson, etc.) and are often irregular. As an alternative, we used distributions derived from kernel density estimation (KDE). Kernel density estimation involves creating a kernel function centered on each observed data point, summing these, and then dividing by the total number of data points. In terms of mean integrated squared error, the Epanechnikov kernel is theoretically optimal, but for computational convenience, the Gaussian kernel is frequently used. In this study, we also used the Gaussian kernel.
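In its standard textbook form, the kernel density estimate at a point $x$ given observations $x_1, \dots, x_n$ can be written as follows. The bandwidth $h$ is a smoothing parameter that the text above does not specify, so it is introduced here as an assumption.

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right), \qquad K(u) = \frac{1}{\sqrt{2\pi}} e^{-u^2/2}$$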
First, let's explain how the LR method was applied using equations. $LR = \frac{L_{1}(D)}{L_{0}(D)}$.
The likelihood ratio can be calculated for each input data point. Here $L_{0}(D)$ is the likelihood of the data under the null hypothesis, that is, under the sleep distribution, and $L_{1}(D)$ is the likelihood under the alternative hypothesis, that is, under the awake distribution. If the LR is greater than a threshold, the data is more likely under the alternative hypothesis and is treated as an awake signal.
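A minimal sketch of the idea, assuming Gaussian-kernel density estimates for the two states and a threshold of 1, might look like this. The bandwidth, threshold, sample values, and function names are all hypothetical. The study's actual distributions come from the transformed ENMO signal and are not reproduced here.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a density function built from Gaussian kernels on each sample."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
                          for s in samples)

    return density


def likelihood_ratio(x, awake_density, sleep_density, eps=1e-12):
    """LR = L1 / L0, with eps guarding against division by zero."""
    return (awake_density(x) + eps) / (sleep_density(x) + eps)


def classify(x, awake_density, sleep_density, threshold=1.0):
    """Return 1 (awake) if the LR exceeds the threshold, else 0 (sleep)."""
    return 1 if likelihood_ratio(x, awake_density, sleep_density) > threshold else 0


# Hypothetical training values for each state.
sleep_samples = [0.00, 0.01, 0.01, 0.02, 0.03]
awake_samples = [0.05, 0.10, 0.15, 0.20, 0.30]
sleep_density = gaussian_kde(sleep_samples, bandwidth=0.02)
awake_density = gaussian_kde(awake_samples, bandwidth=0.02)

for value in (0.01, 0.20):
    print(value, classify(value, awake_density, sleep_density))
```

Because the densities are estimated once and then only evaluated, each new observation requires just two density evaluations and one division.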
Figure 13 visualizes the results of sleep state detection using the above generalization methodology for a single ID. The lowermost panel shows that, after data transformation, the method identifies as many points as possible between the first activity signal point (the moment of waking up) and the last point (the moment of falling asleep).
From a computational efficiency standpoint, the likelihood ratio (LR) method is also advantageous. Because data transformation takes place as the data arrives, the LR results can be produced quickly. Processing 39,059 data points in a single batch took about 7 seconds. Processing one day's worth of data for each of 10 users (17,280 data points per user at 5-second intervals) took around 1 minute and 40 seconds in total.
As expected, this method, which depends on distribution, doesn’t detect cases where the device is missing. However, visual checks using Figure 13 showed that the method works well in detecting signals without label values, as long as the device wasn’t missing.
To assess the robustness of the model, this study proposes using the time difference between predicted values and label values as a performance metric. Since the likelihood ratio method introduced above focuses on generalization, we determined that traditional evaluation metrics from existing sleep research, which are geared toward optimization, would not be applicable.
New Evaluation Metric for Assessing Model Robustness
When comparing the time difference between the model's predicted values and the label values, the model tends to predict the moment of falling asleep earlier than the actual label values and the moment of waking up later than the actual labels. To understand the cause of this, we applied the LR method to the raw, unprocessed ENMO signals to detect sleep states.
As shown in Figure 14a, even when using unprocessed ENMO signals, the tendency for predictions to be early or late remained the same. This suggests that these tendencies are inherent to the collected ENMO signals themselves. It is expected that if additional data, such as pulse rate or other complementary information, is used in the future, the time difference (time diff) could be reduced.
Limitations and Future Research Plans
In the previous section, we briefly discussed assessing the performance of the likelihood ratio comparison method by using the time difference between predicted values and label values. To provide a more objective evaluation, we will now also examine the results of applying this method to IDs that were not included in the training data (test set).
The numbers shown above are the average of the standard errors of time differences (within individual standard error, SE) calculated for each ID. Figure 15a shows the results of applying the likelihood ratio comparison model to 10 randomly selected IDs included in the training data, while Figure 15b shows the results using 3 randomly selected IDs that were not included in the training data.
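The averaging described here can be sketched as follows: for each ID, compute the standard error of the night-by-night time differences, then average those standard errors across IDs. The IDs and time differences (in minutes) are hypothetical, and each ID is assumed to have at least two nights.

```python
import math
from statistics import mean, stdev


def standard_error(values):
    """Sample standard deviation divided by sqrt(n); needs at least two values."""
    return stdev(values) / math.sqrt(len(values))


def mean_within_id_se(time_diffs_by_id):
    """Average the within-individual standard errors across all IDs."""
    return mean(standard_error(diffs) for diffs in time_diffs_by_id.values())


# Hypothetical onset time differences (predicted minus labeled), in minutes.
time_diffs_by_id = {
    "id_a": [10.0, 20.0, 30.0, 40.0],
    "id_b": [5.0, 5.0, 5.0, 5.0],
}
result = mean_within_id_se(time_diffs_by_id)
print(result)
```

A consistently early or late prediction, as for `id_b`, yields a standard error of zero, so this metric captures the stability of the offset rather than its size.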
Verifying the Robustness of the Likelihood Ratio Comparison
The results showed that the processed ENMO signals did not exhibit significant performance differences between the training and test data sets, reaffirming the robustness of the generalization-focused methodology. Although the processed ENMO signals displayed understandable levels of variability, the unprocessed original ENMO signals showed a significant increase in average standard error.
Additionally, Figures 15a and 15b highlight the contribution of data transformation to performance improvement. The original ENMO signals, without any preprocessing, had a higher average standard error compared to the processed ENMO signals. This difference was more pronounced in the test set (Figure 15b), where the average standard error for the processed ENMO signal data was reduced by over 20 minutes for both wakeup and sleep onset times. This underscores the importance of investing effort into data preprocessing to enhance generalization performance.
The training set's performance was validated using random samples from 10 different IDs, while the test set included three additional randomly selected IDs not used in the training distribution, providing a more rigorous test of the model's generalization performance. For reference, the total number of nights analyzed across the 13 IDs was approximately 110, suggesting that a sufficiently long period was utilized for comparing average standard errors.
Key Research Findings
In summary, this study focused on generalization rather than optimization. Efforts were made to enhance generalization performance, starting from the data transformation stage. Using raw data statistics based on accelerometer data can increase dimensionality and make the data more susceptible to outliers and noise, highlighting the need for data preprocessing. Additionally, since sleep patterns are not consistent, it was necessary to stabilize the data by applying the Lomb-Scargle periodogram, which can detect periodicity in unevenly spaced data.
From a modeling perspective, rather than enhancing the fit for each individual data point as is common in traditional machine learning or deep learning models, this study utilized distribution data rich in information. A full distribution carries more information than a single summary statistic such as the variance, leading to a structure that is inherently more efficient from a modeling standpoint. As a result, even users who have only recently started wearing wearable devices can benefit from early detection (though at least one hour of data is needed), improving the practical utility of the device.
Furthermore, the LR method offers the advantage of high computational efficiency. Compared to complex models like machine learning or deep learning, as well as traditional models using rolling statistics, the computational efficiency of the LR method is significantly higher. In the same vein, the LR method is easier to maintain. With its lower model complexity and sequential execution of data preprocessing and LR model inference stages, subsequent modifications to the model structure are also straightforward.
Future Research
Currently, only ENMO signal data is used, but incorporating more supplementary variables (e.g., heart rate) is expected to make sleep state detection more refined. Enhancing performance may also be possible by implementing more detailed updates during data preprocessing for each individual ID. In this study, the period for allowing the use of past distribution data and determining the dominant frequency was chosen through basic experiments, but future studies could consider more precise adjustments.
The heterogeneity that exists between individuals should also be considered. Future studies could achieve higher accuracy by analyzing different groups (e.g., those with above-average activity levels vs. those with minimal activity) rather than simply adjusting the current threshold values. Expanding the study population could also contribute more to public healthcare research by reflecting demographic characteristics among individuals, which would be valuable from both business and sleep research perspectives.
Continually expanding the variety of data has great potential for advancing sleep research. For example, the Healthy Brain Network, which provided the data used in this study, aims to explore the relationship between sleep states and children's psychological conditions. This highlights the increasing importance and interest in using sleep state measurements as a supplementary tool for understanding human psychology and social behavior.
Meaningful Inference Amid Uncertainty
Understanding complex issues depends on how the available information is utilized. The data used in this study are signal data, and most signal measurements inherently contain noise, which introduces uncertainty. Moreover, understanding sleep states requires domain expertise, and direct measurement is often difficult. Despite these difficulties, this research made significant efforts to predict human sleep states through indirect measurements or partial observations, aligning with recent advances in wearable devices.
In conclusion, optimization and generalization are naturally in a trade-off relationship. While this paper focused on generalization, the emphasis on optimization should be adjusted dynamically based on how much precision is required from a business perspective. Just as the phrase "one size fits all" is contradictory and almost impossible to achieve perfectly, it is important to recognize that choices must be made depending on the data and the specific context.
[2] Kishan Bakrania, Thomas Yates, Alex V Rowlands, Dale W Esliger, Sarah Bunnewell, James Sanders, Melanie Davies, Kamlesh Khunti, and Charlotte L Edwardson. Intensity thresholds on raw acceleration data: Euclidean norm minus one (ENMO) and mean amplitude deviation (MAD) approaches. PLoS ONE, 11(10):e0164045, 2016.
[3] Roger J. Cole, Daniel F. Kripke, William Gruen, Daniel J. Mullaney, and J. Christian Gillin. Automatic sleep/wake identification from wrist activity. Sleep, 15(5):461–469, 1992. ISSN 0161-8105. doi: 10.1093/sleep/15.5.461. URL https://doi.org/10.1093/sleep/15.5.461.
[4] Marta Karas, Jiawei Bai, Marcin Strączkiewicz, Jaroslaw Harezlak, Nancy W. Glynn, Tamara Harris, Vadim Zipunnikov, Ciprian Crainiceanu, and Jacek K. Urbanek. Accelerometry data in health research: challenges and opportunities. bioRxiv, 2018. doi: 10.1101/276154. URL https://www.biorxiv.org/content/early/2018/03/05/276154.
[5] Miguel Marino, Yi Li, Michael N. Rueschman, J. W. Winkelman, J. M. Ellenbogen, J. M. Solet, Hilary Dulin, Lisa F. Berkman, and Orfeu M. Buxton. Measuring sleep: Accuracy, sensitivity, and specificity of wrist actigraphy compared to polysomnography. Sleep, 36(11):1747–1755, 2013. ISSN 0161-8105. doi: 10.5665/sleep.3142. URL https://doi.org/10.5665/sleep.3142.
[6] Nigel R Oakley. Validation with polysomnography of the Sleepwatch sleep/wake scoring algorithm used by the Actiwatch activity monitoring system. Mini Mitter Co. Sleep, 2:0–140, 1997.
[7] Matthew R Patterson, Adonay AS Nunes, Dawid Gerstel, Rakesh Pilkar, Tyler Guthrie, Ali Neishabouri, and Christine C Guo. 40 years of actigraphy in sleep medicine and current state of the art algorithms. NPJ Digital Medicine, 6(1):51, 2023.
I am in my early 40s, work at an office near Magoknaru Station, and live near Haengsin Station in Goyang City. I used to commute by company shuttle, but I recently took up cycling as a hobby and now commute by bike. The biggest reason I got into cycling was the positive image I had of Seoul's public bicycle program, Ddareungi.
What Sparked My Interest
One day, I stepped off the shuttle, rubbing my sleepy eyes, and was surprised to see hundreds of green bikes clustered together. I hadn't noticed them before, probably because I’m usually too tired as an office worker, not paying much attention to my surroundings once I get to work. Or maybe it’s just because I’m so groggy in the mornings that the bikes slipped past me. Either way, the sight took me by surprise.
Most crowded areas
I often wondered where the many Seoul bikes at the Magok intersection came from. This also sparked my interest in Ddareungi and made me think about researching public bike programs as a topic for my thesis.
As I continued to develop my thoughts, I suddenly wondered, "Is there really a place that uses bicycles more than Magok?" A quick internet search provided the answer. According to the "2022 Traffic Usage Statistics Report" published by the Seoul Metropolitan Government, the district with the highest use of public bicycles (Ddareungi) in Seoul was Gangseo-gu, with 16,871 cases.
Furthermore, according to data released on the Seoul Open Data Plaza, the top seven public bicycle rental stations in Gangseo-gu are as follows: ▲ Magoknaru Station Exit 2 with 88,001 cases ▲ Balsan Station near Exits 1 and 9 with 63,166 cases ▲ Behind Magoknaru Station Exit 5 with 59,095 cases ▲ Gayang Station Exit 8 with 56,627 cases ▲ Magok Station Intersection with 56,117 cases ▲ Magoknaru Station Exit 3 with 52,167 cases ▲ Behind Balsan Station Exit 6 with 48,145 cases. I was quite surprised to learn this. The place with the highest use of Ddareungi in Seoul was right here, the Magok Business District, where I commute to work.
During my daily commute, I began to notice more people using bicycles than I had originally thought. Bikes are increasingly viewed as a way to address environmental concerns while also promoting fitness for office workers. Inspired by this trend, I considered commuting by bike myself, like many others in Seoul. Since I live outside Seoul, however, I faced the dilemma of choosing between Goyang City's Fifteen program and Seoul's Ddareungi. During my research, I discovered that Goyang City's Fifteen program had been discontinued due to financial losses.
Reasons for Deficits in Public Bicycle Programs
So, I looked into the deficit sizes of other public bicycle programs and found that "Nubija" in Changwon had a deficit of 4.5 billion KRW, "Tashu" in Daejeon had 3.6 billion KRW, and "Tarangke" in Gwangju had a deficit of 1 billion KRW. This showed that most regional public bicycle programs are struggling with deficits. Even Seoul's public bicycle program, "Ddareungi", which I thought was doing well, has a deficit of over 10.3 billion KRW. This made me wonder why public bicycle programs are always in deficit.
At the same time, although Ddareungi is a beloved mode of transportation for the ten million citizens of Seoul, I started to worry whether this program could be sustained in the long run. After looking into the issue, I discovered that the biggest contributor to the deficits in public bicycle programs is the high cost of redistributing the bikes across the city.
For Goyang City, it was estimated that out of a total maintenance budget of 1.778 billion KRW, around 375 million KRW is spent on on-site distribution, and 150 million KRW is used for vehicle operation costs related to redistribution. This means approximately 30% of the total budget goes towards redistribution, making it the largest single expenditure. A similar trend is observed in Changwon City, where redistribution costs also account for a significant portion of the budget. Although this information is not directly about Ddareungi, it suggests that about 30% of the total operating costs of public bicycle programs are likely spent on bicycle redistribution.
This led me to believe that cutting bicycle redistribution costs could be the key to resolving the chronic deficits in public bicycle rental programs. It also made me consider that optimizing redistribution by analyzing Ddareungi users' usage patterns could help reduce these expenses. To achieve this, I needed to analyze the factors influencing rental volume and create a model to predict expected demand, which would help prevent shortages and minimize unnecessary redistribution efforts.
Optimizing Redistribution Through Demand Forecasting
The Ddareungi bike rental data includes bike ID, return time, and station information. To visualize rental volumes by station, additional location data (latitude and longitude) from the Seoul Open Data Plaza was used. Synoptic weather data from the Seoul Meteorological Station was also integrated with the rental records to analyze the impact of weather on bike usage. A detailed analysis of usage patterns was conducted on a four-year dataset (2019-2023) from the Ddareungi station at Exit 5 of Magoknaru Station.
General Usage Patterns
The results showed that bike usage drops with stronger winds and rain but peaks at moderate temperatures (15-17°C). The highest usage occurs during weekday morning and evening commutes. Usage is concentrated in business districts such as Magok, G-Valley, and Yeouido, where most users are in their 20s and 30s. These areas experience imbalances between rentals and returns, especially during commutes.
The general usage patterns were analyzed to forecast bicycle demand and supply. Using the STL (Seasonal and Trend decomposition using Loess) method, rental and return volumes were first decomposed to reveal seasonality, trends, and cycles. The residuals from this decomposition were then applied to a SARIMAX model, incorporating weather and time variables to explain the usage patterns. The model successfully forecasted demand, achieving an R² of 0.73 for returns and 0.65 for rentals.
Optimization Based on the Rental-Return Index Range
To optimize bike redistribution, the "Rental-Return Index" was introduced to measure the balance between expected rentals and expected returns at each station, expressed as their ratio.
\[ \text{1-Day Index} = \frac{\text{Estimated Rental Volume}}{\text{Estimated Return Volume}} \]
As shown in the equation above, when a station has the right balance, with neither a surplus nor a shortage of bikes, the Index equals 1. An Index greater than 1 indicates a shortage, while an Index below 1 signifies a surplus. By categorizing stations into surplus or deficit, redistribution efforts can be directed toward stations with shortages (Index greater than 1), improving customer satisfaction.
In addition, this approach is particularly useful because the number of redistribution targets can be quantified based on the available budget for Seoul's bike system. Stations with the highest Index values are prioritized first, and the top stations for redistribution are selected according to the allocated budget, ensuring cost-effective and efficient redistribution efforts.
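The prioritization described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the station names, forecast values, and the `max_stations` budget cap are all made-up assumptions, and treating a station with no expected returns as an infinite Index is a convention chosen here for the sketch.

```python
def rental_return_index(est_rentals, est_returns):
    """Ratio of estimated rentals to estimated returns for one station-day."""
    if est_returns <= 0:
        # Assumption of this sketch: no expected returns counts as an extreme shortage.
        return float("inf")
    return est_rentals / est_returns


def pick_redistribution_targets(forecasts, max_stations):
    """Return shortage stations (Index > 1), highest Index first, capped by budget.

    forecasts: dict mapping station name -> (est_rentals, est_returns)
    max_stations: how many stations the budget can cover (hypothetical input)
    """
    scored = {name: rental_return_index(r, t) for name, (r, t) in forecasts.items()}
    shortages = [(name, idx) for name, idx in scored.items() if idx > 1]
    shortages.sort(key=lambda pair: pair[1], reverse=True)
    return shortages[:max_stations]


# Made-up forecast values, for illustration only.
forecasts = {
    "Station A": (120, 80),   # Index 1.5, shortage
    "Station B": (60, 90),    # Index below 1, surplus
    "Station C": (100, 100),  # Index 1.0, balanced
    "Station D": (90, 75),    # Index 1.2, shortage
}
targets = pick_redistribution_targets(forecasts, max_stations=2)
print(targets)
```

Sorting by the Index and cutting the list at the budget cap is the simplest reading of "the top stations for redistribution are selected according to the allocated budget"; a cost-per-station model could replace the fixed cap.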
To further optimize bike redistribution, clustering can be applied to group business and residential areas based on rental and return distributions within districts, aiming for a rental-return Index of 1. This method would minimize the distance bikes need to be moved during redistribution, as workers would be assigned to specific teams responsible for managing these clustered regions. In other words, by focusing on areas where the Index is balanced, this approach ensures more efficient redistribution while reducing overall transportation efforts.
Clustering Idea for Implementing Spatial-Temporal Balance
Common Clustering Method
Initially, a K-Means clustering approach was tested to identify areas where the difference between bike rentals and returns was close to zero. By adjusting the number of clusters to match Seoul’s 25 districts, the analysis of June 2023 data showed that clusters with more districts had net volume averages closer to zero, indicating a better balance between rentals and returns. In contrast, smaller clusters with fewer districts exhibited greater imbalance.
Further testing with other clustering methods, such as the Gaussian Mixture Model (GMM), produced results similar to those of K-Means. However, neither method fully captured the underlying bike movement patterns, as these clustering models were unable to account for the dynamic mobility data within the bike-sharing system. This suggested that the algorithms might not be well-suited to the structure of Ddareungi's data, highlighting the need for alternative modeling approaches.
Since Ddareungi’s data reflects bike movements between stations, it is logical to treat these movements as links within a graph, with rental and return stations acting as nodes. By applying a community detection method, clusters can be identified based on the most frequent bike movements. This graph-based approach, which focuses on actual bike movement patterns, could lead to more efficient bike redistribution and yield improved clustering results.
Network Detection Method
The approach involves treating the movement of bikes between rental and return stations as links between nodes, thereby creating a graph. By identifying clusters with the highest number of links, it's possible to detect community divisions where bikes tend to circulate internally. This can significantly enhance the efficiency of bike redistribution across the network.
This is where network community detection comes into play. Community detection is a method that divides a graph into groups with dense internal connections. Applied to Ddareungi data, it helps track rental-return patterns by clustering areas where rentals and returns are balanced. By identifying these clusters, we can detect regions that maintain spatial balance, with more compact clusters reflecting higher modularity.
Modularity measures how densely connected the links are within a community compared to what would be expected if the links were placed at random. It ranges from -0.5 to 1, with values between 0.3 and 0.7 generally indicating the existence of meaningful clusters. Higher modularity signifies stronger internal connections, leading to more effective clustering.
To optimize modularity, the Louvain algorithm was tested. This algorithm works in two phases that are repeated until modularity stops improving. In Phase 1, each node is moved to the neighboring community that yields the largest gain in modularity. In Phase 2, the network is simplified by collapsing each community into a single node and merging the links between communities, after which Phase 1 is applied again to the smaller network.
When applied to Ddareungi data, the Louvain algorithm significantly outperformed K-Means clustering, which relies on Euclidean coordinates. The average net deviation, where 0 is ideal, dropped sharply from 21.19 with K-Means to 9.23 using Louvain, indicating a more balanced clustering of stations. K-Means groups stations purely by coordinate distance and therefore ignores key geographical features like the Han River. The Louvain clusters, in contrast, are built from actual trips and so reflected Seoul's geography, resulting in more precise and meaningful clusters.
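To make the definition concrete, here is a small sketch that computes modularity for a toy graph in plain Python. It assumes an undirected, weighted graph in which an edge weight stands for the number of trips between two stations; real Ddareungi trips are directed, so collapsing them into undirected weights is a simplification made only for this example. The node names and the partition are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity of a partition of an undirected weighted graph.

    edges: list of (u, v, weight), each undirected edge listed once
    community_of: dict mapping node -> community label
    """
    m = sum(w for _, _, w in edges)   # total edge weight
    internal = defaultdict(float)     # edge weight inside each community
    degree = defaultdict(float)       # summed node degree per community
    for u, v, w in edges:
        degree[community_of[u]] += w
        degree[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    # Observed internal share minus the share expected under random rewiring.
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy network: two triangles of stations joined by a single link.
edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
print(q_split, q_single)
```

Splitting the toy graph into its two triangles scores higher than lumping every station together, which is the behavior a modularity-maximizing method exploits.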
The following map comparison highlights this difference, showing how Louvain provides clearer cluster differentiation across the Han River, whereas K-Means fails to capture these geographic distinctions.
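The net-deviation comparison reported above can be illustrated with a short sketch. The exact formula used in the study is not stated, so this example assumes one plausible definition: the net volume of a cluster is its total rentals minus its total returns, and the average net deviation is the mean of the absolute net volumes across clusters. The stations, counts, and cluster labels are invented for illustration.

```python
from collections import defaultdict


def average_net_deviation(stations, cluster_of):
    """Mean absolute (rentals - returns) per cluster; 0 means perfectly balanced.

    stations: dict mapping station name -> (rentals, returns)
    cluster_of: dict mapping station name -> cluster label
    """
    net = defaultdict(int)
    for name, (rentals, returns) in stations.items():
        net[cluster_of[name]] += rentals - returns
    return sum(abs(v) for v in net.values()) / len(net)


# Invented example: S1 and S3 lose bikes, S2 and S4 gain them.
stations = {"S1": (50, 20), "S2": (20, 50), "S3": (40, 10), "S4": (10, 40)}

# Grouping by position puts both net-exporting stations in the same cluster.
by_position = {"S1": "north", "S3": "north", "S2": "south", "S4": "south"}
# Grouping by trip flow pairs each exporting station with its importing partner.
by_flow = {"S1": "x", "S2": "x", "S3": "y", "S4": "y"}

dev_position = average_net_deviation(stations, by_position)
dev_flow = average_net_deviation(stations, by_flow)
print(dev_position, dev_flow)
```

In this toy case the flow-based grouping is perfectly balanced while the position-based grouping is not, mirroring the direction of the K-Means versus Louvain result.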
Understanding the Cycle
I likened Ddareungi bike movement to the flow of water. Just as the total amount of water on Earth remains constant, the total number of Ddareungi bikes stays fixed. This analogy helps conceptualize the system as spatially and temporally closed, where clustering can maintain balance.
Temporal imbalances can be managed by tracking the flow of bikes throughout the day. For instance, business districts experience high demand in the morning but accumulate excess bikes by evening, while residential areas face the opposite situation. Redistribution efforts can be minimized by transferring surplus bikes from business districts to residential areas overnight, before the morning commute begins. After the morning rush, bikes concentrate in business districts but are naturally redistributed as users ride them back to residential areas during the evening commute.
Although there is some uncertainty in the evening, as it's unclear whether users will choose bikes for their return journey, any surplus can still be addressed overnight as part of the regular redistribution cycle. This ensures that before the next morning commute, any leftover bikes in business districts are moved to residential areas as mentioned above. When viewed over a full day, these fluctuations tend to balance out, reducing the need for excessive intervention.
To manage these imbalances more effectively, a rental-return index was used to prioritize stations for redistribution, ultimately reducing operational costs. Additionally, network community detection, particularly through the Louvain algorithm, provided more accurate clustering than previous methods. This approach better reflected Seoul's geography, especially by distinguishing clusters across the Han River, greatly improving redistribution strategies.
By viewing Ddareungi as a system striving for both spatial and temporal balance, shortages can be managed more efficiently. This approach not only optimizes the Ddareungi system but also offers valuable insights for enhancing the management of other shared resource systems.
It's difficult to maintain blood stock at safe levels
South Korea has recorded its lowest birth rate in history. In 2023, the country's total fertility rate was 0.72, raising concerns about various future issues. Among them, the potential blood supply shortage due to low birth rates has come into focus. According to the Korean Red Cross, by 2028, the demand for whole blood donations is expected to exceed supply. Moreover, this gap is anticipated to widen further.
Blood shortages have long been a recurring problem. Especially during the winter season, the lack of blood donors causes hospital staff to worry about whether they can ensure a smooth supply of blood to patients. Despite these concerns, the blood shortage problem continues to worsen.
The Korean Red Cross considers a blood stock of more than five days to be at a "safe level", while a stock of less than five days is regarded as a "shortage". However, past data shows that the number of days the blood stock remains at a safe level has been decreasing.
Why is it difficult to maintain blood stock at a safe level? The reason is that both the supply and the usage of blood are hard to control. Blood is used in medical procedures like surgeries, and reducing its usage would cause significant backlash. On the other hand, blood can only be supplied through donations, meaning supply is limited. Therefore, despite the efforts of the Korean Red Cross, it remains challenging to keep blood stock at a safe level.
Literature Review
This study aims to understand the dynamics of blood supply and usage to help address the issue of blood shortages. Additionally, the study measures the effects of "blood donation promotional activities", one of the key factors in increasing blood supply, and proposes efficient solutions.
Before delving into the analysis, let's review how previous studies have approached blood supply and usage. Blood has the characteristics of a public good, so it's heavily influenced by laws, and blood donation and management systems vary significantly between countries. Therefore, it was deemed difficult to apply research findings from other countries domestically, which is why I focused on reviewing domestic studies.
Yang Ji-hye (2013), Lee Tae-min (2013), Yang Jun-seok (2019), and Shin Ui-young (2021) focused on qualitative analysis by identifying motivations for blood donation participation through surveys. Kim Shin (2015) used multiple linear regression analysis to predict the number of donations by individual donors. However, personal information of donors was used as explanatory variables, and time series factors were not considered, making it difficult to understand the dynamics of blood supply and usage. Kim Eun-hee (2023) studied the impact of the COVID-19 pandemic on the number of donations, but her research had limitations, as it did not account for exogenous variables or types of blood donations. Unfortunately, previous studies did not focus on the dynamics of blood supply and usage, leaving little content to reference for this analysis.
Analysis of Blood Supply Dynamics
Selection of Analysis Subjects
From this section, I will introduce the analysis process. Rather than diving straight into the analysis, I will first clearly define the subjects of analysis. The Korean Red Cross publishes annual blood donation statistics, providing the number of donors categorized by group (age, gender, donation method, etc.). This study utilized that data for the analysis.
There are various types of blood donations. Depending on the method, donations are classified into whole blood, plasma, and platelets & multiple components. First, looking at plasma, approximately 68% of it is used as a raw material for pharmaceutical production, and it has a long shelf life of one year, making imports feasible. Therefore, in the case of plasma shortages, the issue can be resolved through imports, and as such, it is not our primary concern.
Next, platelet & multiple component donation has stricter criteria. Women who have experienced pregnancy are not eligible to donate, and it requires better vascular conditions compared to other types of donations. As a result, the gender ratio of donors is skewed at 20:1, raising concerns about sample bias and making it difficult to derive accurate estimates during analysis. Moreover, unlike whole blood, platelet & multiple component donations are primarily used for specific diseases. For these reasons, this study focuses solely on whole blood donations as the subject of analysis.
After selecting whole blood donations as the subject of analysis, one concern arose: whether to differentiate the data based on the amount of blood collected. The data I received is categorized by 320ml and 400ml amounts. Should I divide the data based on these amounts, just as we divide groups by gender? I decided that it would not be appropriate to make this distinction. Dividing the data by amount would distort the data structure because the amount is not a choice made by the donor but is determined by the donor's age and weight. Since donors cannot choose the amount, the 320ml and 400ml data come from the same distribution, and dividing them would arbitrarily split this distribution. Therefore, in this analysis, I integrated the data categorized by amount of blood collected and defined it as the "number of donors" for the analysis.
The day of the week effect
Now that the analysis target has been clearly defined as the number of whole blood donors, let's begin the analysis. Since the number of donors is time series data, it's important to check whether it shows any seasonality. First of all, it is expected that the number of donors will vary depending on the weekly seasonality, specifically the day of the week and holidays. Let's examine the data to confirm this.
As seen in Figure 3, the number of blood donors is higher on weekdays and relatively lower on holidays. Let's incorporate this information into the model. If the differences between groups in the data are overlooked and not included in the model, omitted variable bias (OVB) may occur, leading to inaccurate results. Therefore, it is important to identify variables that could cause group differences and incorporate them in the model.
It is natural to think that if we are dividing the data by groups, we should also split the data by gender. However, there is no need to group the blood donor data by gender. This is because the purpose of the analysis is to understand the dynamics of the blood supply from the perspective of the entire population. If the goal were to analyze individual donation frequencies, gender would be an important variable. However, since we are examining data for the whole population, there is no need to separate by gender. Additionally, when the number of male and female donors is normalized for mean and variance, they show very similar patterns. For these reasons, we analyzed the data without dividing it by gender.
Next, let's examine how the distribution changes as we divide the blood donor data into groups. Our goal is for the data to follow a normal distribution, since an approximately normal distribution suggests that no major unexplained factors remain in the data.
First, let's look at the distribution of the number of blood donors without dividing it into any groups. The distribution shows a bimodal pattern, which indicates that there are still many unexplained factors in the data. Now, let's add the day-of-the-week effect that we discovered earlier to the model and see how the distribution changes. As seen in Figure 5, the distribution of weekday data after removing the day-of-the-week effect is no longer bimodal and has shifted to resemble a bell shape.
The distribution of the data after removing the day-of-the-week effect takes on a bell shape, but the long tail extending to the left is still concerning. We suspected that this tail came from holidays, when most donation centers are closed and donor numbers are unusually low, and we incorporated this into the model. When we plotted the distribution using only data from non-holiday days, just as we did when removing the day-of-the-week effect, the tail disappeared.
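Removing a group effect, as done above for the day of the week and for holidays, can be sketched as subtracting each group's mean from its observations. For a single grouping variable this gives the same residuals as a regression on that variable's dummy variables. The donor counts and group labels below are invented for illustration and are not taken from the Korean Red Cross data.

```python
from collections import defaultdict
from statistics import mean


def remove_group_effect(values, groups):
    """Subtract each group's mean from its members.

    For one categorical variable, this equals the residual of an OLS
    regression on that variable's dummy variables.
    """
    by_group = defaultdict(list)
    for v, g in zip(values, groups):
        by_group[g].append(v)
    group_mean = {g: mean(vs) for g, vs in by_group.items()}
    return [v - group_mean[g] for v, g in zip(values, groups)]


# Invented daily donor counts with two clearly different levels.
donors = [500, 520, 300, 310, 510, 290]
day_type = ["weekday", "weekday", "weekend", "weekend", "weekday", "weekend"]

residuals = remove_group_effect(donors, day_type)
print(residuals)
```

The raw counts cluster around two levels, which is what produces a bimodal histogram; after the group means are removed, the residuals are centered on zero. With several grouping variables at once (day of week plus holiday), a joint regression is needed rather than repeated subtraction.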
Annual Seasonality
So far, we have identified day of the week and holidays as factors that influence the number of blood donors. Let's express this in a regression equation and check the residuals. If the residuals do not follow a normal distribution, it means there are still unexplained factors affecting the number of blood donors. The regression equation for the number of blood donors based on day of the week and holidays is shown below.
This equation means that the response variable represents the number of whole blood donors, combining both 320ml and 400ml blood donations. The explanatory variables are the day of the week and holidays, which have been included in the equation in the form of dummy variables.
The residuals after removing the day-of-the-week and holiday effects no longer show the unusual patterns from the original data, such as the bimodal shape or long tail. However, when looking at the right side of the mean, there is an unusual pattern that wasn't detected in the distribution of blood donors. This suggests that there are still factors not explained by the day-of-the-week and holiday variables. What could those factors be?
There are two types of seasonality: weekly seasonality, such as day-of-the-week effects, and annual seasonality, like spring, summer, fall, and winter. Since we've already accounted for weekly seasonality, let's now consider annual seasonality. As mentioned earlier, we know that the number of blood donors tends to decrease in winter, so we can expect that annual seasonality exists. Let's examine the data to confirm this.
Looking at Figure 8, we can see that the distribution of blood donors varies by month. Therefore, it is reasonable to conclude that annual seasonality exists in the number of blood donors, and we should incorporate this into the model. It is suspected that annual seasonality may be contributing to the unusual patterns in the residuals.
How can we incorporate annual seasonality into the model? The simplest method would be to include all days of the year using 365 dummy variables. However, this approach is inefficient as it uses too many variables. When there are too many variables, the model's variance increases, and multicollinearity issues may arise. This is especially concerning because the number of blood donors does not fluctuate dramatically on a daily basis, so multicollinearity is likely. So, how can we capture similar information without using 365 dummy variables?
Let's focus on the word “cycle”. When we think of cycles, sine and cosine functions come to mind. How about using sine and cosine functions to capture annual seasonality? This approach is called Harmonic Regression.
Figure 9 illustrates that annual seasonality is captured using appropriate sine and cosine functions. By using a method suited to the characteristics of the cycle, we were able to capture seasonality with a small number of variables. Of course, using temperature to capture annual seasonality is another option. That method has the advantage of being more intuitive and of making the variable easier to control for. However, there is annual seasonality in the blood donor data that cannot be fully explained by temperature alone, which is why harmonic regression was used to model the seasonality.
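A minimal sketch of harmonic regression with a single annual sine and cosine pair follows. It relies on a simplifying assumption that is not from the study: the series covers a whole number of 365-day years with no gaps. In that case the constant, cosine, and sine regressors are mutually orthogonal, so the least-squares coefficients reduce to simple sums. The synthetic series and its coefficients are made up, and a real model with day-of-week and holiday dummies would need a general least-squares solver.

```python
import math

PERIOD = 365  # days in one annual cycle (leap days ignored in this sketch)


def fit_annual_harmonic(y):
    """Fit y_t ~ c + a*cos(2*pi*t/365) + b*sin(2*pi*t/365) by least squares.

    Assumes len(y) is a whole number of years of daily data; only then are
    the three regressors orthogonal and the closed-form sums below exact.
    """
    n = len(y)
    if n == 0 or n % PERIOD != 0:
        raise ValueError("this shortcut needs whole years of daily data")
    w = 2 * math.pi / PERIOD
    c = sum(y) / n
    a = 2 / n * sum(v * math.cos(w * t) for t, v in enumerate(y))
    b = 2 / n * sum(v * math.sin(w * t) for t, v in enumerate(y))
    return c, a, b


# Synthetic two-year series with known coefficients (made-up numbers).
w = 2 * math.pi / PERIOD
series = [100 + 10 * math.cos(w * t) + 5 * math.sin(w * t) for t in range(2 * PERIOD)]

c, a, b = fit_annual_harmonic(series)
print(round(c, 6), round(a, 6), round(b, 6))
```

Two coefficients stand in for what would otherwise take 365 daily dummy variables; additional pairs at multiples of the base frequency can be added if one pair is too smooth.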
As a result of incorporating annual seasonality into the model, the unusual patterns in the residuals were eliminated. The regression equation with annual seasonality included is shown below.
Do temperature and weather affect the number of blood donors? Upon investigating the data, we found that 70% of donors visit blood donation centers in person. This leads to a strong suspicion that temperature and precipitation, which influence outdoor activities, could have a significant impact on the number of blood donors.
Since weather conditions vary significantly by region, we conducted the analysis separately for each region. We examined the significance of temperature and precipitation variables for individual regions. The results showed that precipitation negatively impacted the number of blood donors in all regions, while temperature did not have a significant effect. This is because the information provided by temperature was already captured when we incorporated annual seasonality into the model. The regression equation, including precipitation, is shown below.
Dynamics of Blood Supply and Usage During the COVID-19 Period
In this section, we will examine how blood stock responds when a significant external shock occurs. Specifically, we will analyze the dynamics of blood stock during the COVID-19 period, which was the most significant recent shock.
It is likely that maintaining blood stock above a certain level was challenging during the COVID-19 period. This is because population movement significantly decreased due to various quarantine measures and fears of infection. Moreover, as shown in Figure 12, the number of individuals ineligible for blood donation increased starting in 2020. This was due to the introduction of new health criteria during the COVID-19 period, which restricted blood donations for a certain period after recovering from COVID-19 or receiving a vaccine. For these reasons, we expect that blood stock levels decreased significantly during the pandemic. Let’s examine the data to see if our hypothesis is correct.
As seen in Figure 13, interestingly, blood stock levels were maintained above a certain level during the COVID-19 period. The blood stock never dropped below two days' supply. How was the Korean Red Cross able to maintain blood stock above a certain level despite the external shock of the pandemic?
After controlling for the factors considered earlier and conducting a regression analysis, it was found that blood usage decreased by 4.25% during the COVID-19 pandemic. This reduction can be attributed to two factors: the intentional decrease in blood usage to maintain stock levels, and the natural decline due to the shortage of medical personnel and hospital wards during the pandemic.
A regression analysis on blood supply using the same variables showed a 5.3% decrease in supply. The reason blood stock levels were maintained during the COVID-19 period is that both usage and supply decreased at similar rates. However, considering the broader societal impact of the pandemic, the 5.3% decrease is relatively minimal.
Finding of the "Blood Shortage" Variable
A regression analysis of blood donor numbers by region showed that, in certain areas, the number of donors increased. Since COVID-19 was not confined to specific regions, this result is counterintuitive. Therefore, it is suspected that some factor during the pandemic may have contributed to an increase in blood supply in those areas. Additionally, the overall 5.3% decrease in the number of donors was likely kept this small because the same factor partially offset the decline.
We anticipated that an increase factor might come into play during periods of blood shortage. Thus, we created a proxy variable called "Blood Shortage". Days when blood stock dropped below a certain level, along with a defined period thereafter, were classified as "shortage periods". This reflects the impact of specific measures taken by the Korean Red Cross during these periods.
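The construction of such a proxy variable can be sketched as follows. The threshold and the length of the follow-on period are not reported in the text, so `threshold` and `carryover` are hypothetical parameters, and the stock values are invented.

```python
def shortage_flags(stock_days, threshold, carryover):
    """Flag days with stock below `threshold`, plus `carryover` days after each.

    stock_days: daily blood stock, expressed in days of supply
    threshold, carryover: hypothetical settings; the study does not report them
    """
    flags = [0] * len(stock_days)
    for i, stock in enumerate(stock_days):
        if stock < threshold:
            # Mark the shortage day itself and the following carryover days.
            for j in range(i, min(i + carryover + 1, len(stock_days))):
                flags[j] = 1
    return flags


# Invented daily stock levels, in days of supply.
stock = [5.2, 4.8, 2.9, 3.4, 4.1, 5.0, 5.5, 2.8, 4.0]
flags = shortage_flags(stock, threshold=3.0, carryover=2)
print(flags)
```

The resulting 0/1 series can enter the regression as a dummy variable alongside the day-of-week, holiday, seasonality, and precipitation terms.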
An analysis of the effect of the "Blood Shortage" variable on the number of blood donors showed a positive effect on donor numbers in most regions. This supports the earlier hypothesis that some factor was increasing blood supply. Similarly, when we examined the effect of the shortage condition on blood usage, we observed a decrease in usage during those periods. This indicates that the manual for blood supply shortages, which is triggered when blood stock falls below a certain threshold, worked effectively.
However, the increase factor associated with the "Blood Shortage" variable is likely effective only when the decrease in blood donors can be anticipated in advance, because the Korean Red Cross needs to predict a decline in donor numbers in order to respond with promotion efforts. Let’s verify this with the data.
The model’s residuals show that the number of blood donors decreased during two unexpected events: the early stage of the COVID-19 pandemic in Daegu/Gyeongbuk and the Omicron wave. In other, more predictable periods, donor numbers did not continue to decline, which suggests that the increase factor operated effectively. Blood stock levels were maintained during those periods because the manual for blood supply shortages was activated and the public became more aware of the shortage, which led to more proactive blood donations and helped increase supply.
Measuring the Effect of Promotions
The Effect of the Additional Giveaway Promotion
During the COVID-19 period, the Korean Red Cross employed various methods to prevent a decline in the number of blood donors, including promotions, SMS donation appeals, and public service advertisements. Which of these methods was the most effective? If the effect can be accurately measured, the Korean Red Cross will be able to respond more efficiently to future blood shortages.
It would be ideal to measure the effect of all methods, but most were difficult to analyze, either because data were lacking or because they were one-time events. Fortunately, promotions were suitable for quantitative analysis, so we focused on measuring their impact. Let’s examine how much promotions increased the number of blood donors.
The giveaway promotion was conducted in the same way across all regions for an extended period, so there should be no major issues in measuring its effect. To assess its impact, we created a dummy variable for "promotion days" while controlling for the variables we previously identified. The results showed that the response to the promotion varied by gender. Men responded strongly to the promotion while women did not show a significant response. However, does simply adding a dummy variable truly capture the pure increase driven by the promotion?
Using a simple dummy variable to capture the effect of the promotion period results in a mixture of both the "promotion effect" and the "trend during the promotion period". For example, the number of blood donors in May and December differs. May sees more donors due to favorable weather, while December sees fewer. Therefore, simply adding a dummy variable makes it difficult to isolate the pure effect of the promotion, as the existing higher donor numbers in May may get mixed with the increase from the promotion itself. We need to consider how to separate these effects to accurately measure the promotion's impact.
As shown in Figure 18, the giveaway promotion was conducted on a quarterly basis. Since each quarter shares similar seasonality, there is likely no significant change in the number of blood donors across quarters. To remove trends, the entire timeline was divided into quarters, and the promotion’s pure impact was then measured.
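The difference between a pooled dummy and a per-quarter comparison can be shown with a small Python sketch. It is one possible reading of the approach, assuming that each quarter contains both promotion and non-promotion days; the function names and every number are hypothetical, chosen so that the true promotion effect is +50 donors per day.

```python
from collections import defaultdict
from statistics import mean

def naive_effect(rows):
    """Promo-day mean minus non-promo-day mean, pooled over all quarters."""
    promo = [r["donors"] for r in rows if r["promo"]]
    base = [r["donors"] for r in rows if not r["promo"]]
    return mean(promo) - mean(base)

def within_quarter_effect(rows):
    """Average of the promo vs. non-promo gap computed inside each quarter."""
    by_q = defaultdict(lambda: {True: [], False: []})
    for r in rows:
        by_q[r["quarter"]][bool(r["promo"])].append(r["donors"])
    # Only quarters with both kinds of days can contribute a gap.
    gaps = [mean(g[True]) - mean(g[False])
            for g in by_q.values() if g[True] and g[False]]
    return mean(gaps)

# Hypothetical days: Q2 baseline 1000, Q4 baseline 800, promotion adds 50.
# Promotion days are concentrated in the high-baseline quarter.
rows = (
    [{"quarter": "Q2", "promo": 1, "donors": 1050}] * 3
    + [{"quarter": "Q2", "promo": 0, "donors": 1000}]
    + [{"quarter": "Q4", "promo": 1, "donors": 850}]
    + [{"quarter": "Q4", "promo": 0, "donors": 800}] * 3
)

naive = naive_effect(rows)            # 150: seasonality mixed in
within = within_quarter_effect(rows)  # 50: the built-in promotion effect
print(naive, within)
```

In this toy data the pooled comparison overstates the effect threefold, purely because promotion days fall more often in the quarter with the higher baseline.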
After removing the trends, there is no significant difference in the promotion response between the male and female groups. Although some variance remains due to unexplained social factors, the average response is similar, and the results are more accurate than those obtained with a simple dummy variable.
The Effect of Special Promotions
In addition to the giveaway promotion, the Korean Red Cross conducted various special promotions, including gift cards, souvenirs, travel vouchers, and sports event tickets. To accurately measure the effect of these special promotions, it is essential to remove the trends, just as with the giveaway promotion. In other words, we need to identify periods where there would be no differences except for the promotion. In this analysis, we examined the difference in the number of blood donors two weeks before and after the promotion period, as well as during the promotion period itself.
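The before/after comparison described above can be sketched as a percent lift over a baseline window. This is a minimal illustration: the helper `promo_lift_pct`, the choice to average the two-week windows before and after into one baseline, and the data are assumptions rather than the article's exact procedure.

```python
from statistics import mean

def promo_lift_pct(daily_donors, start, end, window=14):
    """Percent lift of mean donors during [start, end) relative to the
    mean of the `window` days before and the `window` days after."""
    before = daily_donors[max(0, start - window):start]
    after = daily_donors[end:end + window]
    during = daily_donors[start:end]
    baseline = mean(before + after)
    return (mean(during) / baseline - 1.0) * 100.0

# Hypothetical series: 14 ordinary days, a 7-day promotion, 14 ordinary days.
series = [100] * 14 + [110] * 7 + [100] * 14
lift = promo_lift_pct(series, start=14, end=21)
print(round(lift, 2))  # about 10 percent
```

Because the baseline comes from days immediately adjacent to the promotion, slow seasonal movements largely cancel out of the comparison.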
Special promotions increased the number of blood donors in many regions. Among them, offering tickets to sports events was particularly effective. We therefore suggest using sports event tickets to raise donor numbers during anticipated periods of blood shortage.
An Episode from Data Collection
Here, I will conclude the analysis by sharing an episode from the data collection. The data used in this research was collected through various channels. For data related to blood services statistics, I was able to obtain well-organized information through the Statistics Korea API. However, other data sources were not as easily accessible, which was somewhat disappointing. While blood stock, usage, and supply data are available through other APIs, they only provide monthly data, which lacks the resolution needed for detailed analysis.
Fortunately, since the Korean Red Cross is a government organization, we were able to request daily data on blood stock, usage, and supply, as well as data on the giveaway promotion, through a "Public Information Request". Government departments and public institutions often provide access to such data, excluding sensitive personal information. I encourage other researchers to actively use information disclosure requests to obtain high-quality data. This is especially true in South Korea, where the digitization of administrative data is well developed and researchers can access the materials they need for their studies.