The process of turning web novels into webtoons and data science

Web novel to webtoon conversion is not based only on 'profitability'
If the novel's author has money or bargaining power, 'webtoonization' may be nothing more than a marketing tool for the web novel.
Data science models built on market variables cannot capture such cases

A student in SIAI’s MBA AI/BigData program, struggling with her thesis, chose as her topic the conditions under which a web novel is turned into a webtoon. Most people would simply assume that if a web novel has many views and large sales, a follow-on contract with a webtoon studio will come much more easily. She brought in a few data science papers as references, but they looked only at publicly available information. What if the conversion was the web novel author’s own choice? What if the author just wanted to spend more of the marketing budget by adding a webtoon to the line-up?

The literature mostly runs hierarchical structures through ‘deep learning’ or uses ‘SVM’, work that relies purely on computation, and simply tries every case the Python library offers. Sorry to put it this way, but such calculations are nothing more than a waste of computer resources. It has also been pointed out that crude reports of this kind are still registered as academic papers.

Put all the crawled data into ‘AI’, and it will wave a magic wand?

Converting a web novel into a webtoon can be seen as changing a written story book into an illustrated story book. Professor Daeyoung Lee, Dean of the Graduate School of Arts at Chung-Ang University, explained that the change to OTT is a change to video story books.

The reason this transition is not easy is that the transition costs are high. Domestic webtoon studios employ teams of anywhere from five to dozens of designers, and the market has become so specialized that even a small character image or pattern that looks simple to our eyes must be purchased before it can be used. Once all the labor costs and the purchasing costs for characters, patterns, and so on are paid, turning a web novel into a webtoon still takes $$$.

Because investment money goes in and new commercialization challenges have to be met, typical ‘business experts’ probably assume that manpower and funds will be concentrated on the web novels that seem most likely to succeed as webtoons.

However, the market does not operate solely on the logic of capital, and ‘plans’ based on the logic of capital are often wrong because they fail to read the market properly. In other words, even if you build a model from data such as the number of views, comments, and purchases provided by platforms and use it to estimate the likelihood of webtoonization and the success of the webtoon, it is unlikely to be correct in practice.

One thing to point out here is that although there are many errors due to market uncertainty, there are also a significant number of errors due to model inaccuracy.

Wrong data, wrong model

For those who simply think that ‘deep learning’ or ‘artificial intelligence’ will take care of it, building a model incorrectly means using a less suitable algorithm when another ‘deep learning’ algorithm would have been a better fit. At worst, they understand it as using a less capable artificial intelligence when a more capable one should have been used.

However, which ‘deep learning’ or ‘artificial intelligence’ fits well and which does not is a lower priority. What really matters is how accurately you capture the market structure hidden in the data, so you must verify that the model fits well not just by chance on the data selected today but consistently on data selected in the future. Unfortunately, we have long seen that most ‘artificial intelligence’-related papers published in Korea intentionally select and compare data from time points that happen to match well, while professors’ research capabilities are judged simply by counting K-SCI papers. We cannot help but point out that proper verification is not carried out because of the Ministry of Education’s crude regulations, under which journals that come out frequently count as good journals.
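The kind of verification described above can be sketched in a few lines of Python. This is only an illustration on synthetic data, not an analysis of any real platform: the function names (`make_period`, `fit_threshold`, `accuracy`), the single ‘views’ variable, and the noise levels per period are all assumptions made for the example. The idea is to fit on everything up to a given period, test on the next unseen period, and report every split rather than only the one that happens to match well.

```python
import random

def make_period(rng, noise, n=500):
    """One period of hypothetical (views, converted) records."""
    data = []
    for _ in range(n):
        views = rng.uniform(0, 100)
        converted = views > 50
        if rng.random() < noise:  # noise flips the outcome
            converted = not converted
        data.append((views, converted))
    return data

def accuracy(data, threshold):
    """Share of records where the rule 'views > threshold' matches the outcome."""
    hits = sum((views > threshold) == converted for views, converted in data)
    return hits / len(data)

def fit_threshold(data):
    """Pick the views cutoff (0..100) with the best in-sample accuracy."""
    return max(range(0, 101), key=lambda t: accuracy(data, t))

rng = random.Random(1)
# Assumed noise levels: the market becomes less predictable over time.
periods = [make_period(rng, noise) for noise in (0.05, 0.10, 0.20, 0.30, 0.40)]

results = []
for k in range(1, len(periods)):
    train = [row for p in periods[:k] for row in p]  # all periods before k
    test = periods[k]                                # the next, unseen period
    t = fit_threshold(train)
    results.append((k, accuracy(train, t), accuracy(test, t)))

for k, in_sample, forward in results:
    print(f"test period {k}: in-sample={in_sample:.2f}  forward={forward:.2f}")
```

Under these assumptions the in-sample figure flatters the model, while the forward figure falls as the later periods get noisier. A paper that reported only one favorable split would hide exactly that gap.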

The calculation known as ‘deep learning’ is simply one of the graph models that find nonlinear patterns in a more computation-dependent manner. For natural language, which must follow grammar, or computer games, which must run according to rules, there may be no major problem in using it, because the probability of errors in the data itself is close to 0%. The webtoonization process above, however, may involve market responses nobody expected and problems that remain unresolved, and the decision-making behind webtoons is likely to be quite different from what an outsider sees.

Simply put, the barriers facing writers who already have a successful ‘track record’ are completely different from the barriers facing new writers. Kang Full, a writer who recently achieved great success with ‘Moving’, explained in an interview that he held the intellectual property rights to his webtoons from the beginning and made the major decisions himself during the transition to OTT. This is a situation ordinary web novel and webtoon writers cannot even imagine, because most web novel and webtoon platforms carry content under contracts in which the platform retains the intellectual property rights for secondary works.

How often can an author decide, by his or her own will, whether to make a webtoon or go to OTT? If that proportion increases, what conclusion will the ‘deep learning’ model above produce?

The general public’s way of thinking does not include cases where webtoon and OTT adaptations are carried out at the author’s will. The ‘artificial intelligence’ models mentioned above can only explain how much of the ‘logic of capital’ operating inside web novel and webtoon platforms holds true. However, as the share of conversions driven by the ‘author’s will’ rather than the ‘logic of capital’ increases, such a model will judge the effects of the variables we expected to matter to be much lower and, conversely, make the effects of unexpected variables appear higher. In reality, this happens simply because we failed to include an important variable, the ‘author’s will’, that should have been reflected in the model. Since we never even considered it, we end up with an absurd story under an absurd title like ‘Webtoonization process informed by artificial intelligence’.
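The effect of leaving out the ‘author’s will’ can be illustrated with a small simulation. Everything here is hypothetical: the function `simulate`, the parameter `will_rate` (the chance that an established author pushes a webtoon through regardless of views), the 80% and 20% conversion rates under the ‘logic of capital’, and the 50/50 splits are assumptions chosen only to make the mechanism visible. The analyst in this sketch observes views and track record but never the author’s will.

```python
import random

def simulate(will_rate, n=40000, seed=7):
    """Simulate conversions; return (gap_by_views, gap_by_track_record).

    Each gap is the difference in conversion rate between novels
    with and without the given observed flag.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        high_views = rng.random() < 0.5
        established = rng.random() < 0.5
        if established and rng.random() < will_rate:
            converted = True  # author's will: views play no role
        else:
            # logic of capital (assumed rates): 80% vs 20%
            converted = rng.random() < (0.8 if high_views else 0.2)
        rows.append((high_views, established, converted))

    def gap(col):
        # Conversion rate where the flag is on minus the rate where it is off.
        on = [r[2] for r in rows if r[col]]
        off = [r[2] for r in rows if not r[col]]
        return sum(on) / len(on) - sum(off) / len(off)

    return gap(0), gap(1)

for w in (0.0, 0.3, 0.6):
    views_gap, track_gap = simulate(w)
    print(f"will_rate={w:.1f}  views gap={views_gap:.2f}  "
          f"track-record gap={track_gap:.2f}")
```

With these assumed numbers the expected views gap is 0.6 × (1 − 0.5 × `will_rate`) and the expected track-record gap is 0.5 × `will_rate`. As the author’s will drives more conversions, the apparent effect of views shrinks, and track record, which has no effect at all under the logic of capital in this setup, starts to look like an important variable.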

Before data collection, understand the market first

It has now been two months since the student brought that model. For the past two months, I have been asking her to properly understand the market situation to find the missing pieces in the webtoonization process.

In my business experience, I have seen companies that thought they could take on an interesting challenge with enough data but could not proceed for lack of the ‘Chairman’s will’. On the other hand, I have seen countless times how companies that were completely unprepared, or did not even have the necessary manpower, come up with absurd project ideas on the grounds that ‘this is what the Chairman said’ and announce that they will proceed ‘as usual’, after which only IT developers are hired, without data science experts, and the work of copying open libraries from overseas is repeated.

Considering the capital and market conditions that the webtoonization process also requires, it is highly likely that a significant number of webtoons are included as a ‘bundle’ in web novel writers’ new work contracts, offered as a matter of course to attract already successful web novel writers and generate profits. Writers who want control over the webtoon studio are likely to contract with a webtoon studio themselves, then sign with the webtoon platform and start serializing the webtoon after the first 100 or 300 episodes of the web novel are released. A web novel writer who has already seen profits rise from the extra promotion a webtoon gives the web novel may also treat the webtoon as one of the promotional strategies for selling the intellectual property (IP) at a higher price.

To the general public, this ‘author’s will’ may seem like an exception, but once such cases account for more than even 30% of web novels converted to webtoons, it becomes impossible to explain webtoonization with data collected through conventional thinking. When various market factors already make it difficult to increase accuracy, and more than 30% of cases are driven by other variables such as ‘the author’s will’ rather than ‘market logic’, how can data collected through conventional thinking lead to a meaningful explanation?

Data science is not about learning ‘deep learning’ but about building an appropriate model

In the end, it comes back to the point I always make to students: ‘we must understand reality and find a model that fits that reality.’ In technical terms, this means finding a model that fits the ‘Data Generating Process (DGP)’, but the explanatory model for webtoonization above does not currently take the DGP into consideration at all. If scholars had to sit through such a presentation, complaints such as ‘Who on earth selected the presenters?’ would arise, and many would simply walk out even at the risk of being criticized as rude. This is because such a presentation is itself already disrespectful to the attendees.

In the above situation, building a model that takes the DGP into account requires a lot of background knowledge about the web novel and webtoon markets. There is no point in ignoring factors such as how web novel writers on major platforms communicate with platform managers, what the market relationship between writers and platforms is like, and to what extent and how the government intervenes, and simply ‘putting’ material scraped from the Internet ‘into’ the models that appear in ‘artificial intelligence’ textbooks. If an understanding of the market could be derived from that data, it would be attractive data work, but as I keep saying, unless the data takes the form of natural language that follows grammar or a game that follows rules, it will only be a meaningless waste of computer resources.

I don’t know whether the student will do enough market research to demolish my counterargument at next month’s meeting, whether she will change the detailed structure of the model based on her understanding of the market, or, worse, whether she will change the topic. What is certain is that a ‘paper’ that carries the name ‘data’ but merely feeds collected data into a coding library will end up as nothing more than ‘mixed-up code’ containing only one’s own delusions, or a ‘novel filled with text only’.

Keith Lee

Head of GIAI Research Head of GIAI Korea co-Founder @ Swiss Institute of Artificial Intelligence