NJ2018-2 Lyndsey

Revision as of 09:49, 23 December 2018 (Sun) by Lyndsey (talk | contribs) (On Translating Nongsa jikseol into Data)


For this final project, I investigated how to organize the various processes and steps of Nongsa jikseol. This was challenging for several reasons: the steps for one plant variety can depend on the steps for other varieties; some varieties share some processes but not others; and some steps are conditional on events such as a particular date on the calendar or the growth of a plant to a certain height.

I was unable to add all the data for all of the crop varieties, but I have selected a few of the more tricky and complex cases to demonstrate how these can best be organized.

Below, I first list all of the cultivation methods contained within the text. Then I present my suggested ontology (covering only the elements relating to step order and timing, not tools, ways-of-doing the steps, rationales for the steps, or cautions) with some examples. Next, I explain the ontology and the reasons why I structured it the way I did. In closing, I discuss the work left to be done in the future and my general comments on the process of translating Nongsa jikseol into data.

Cultivation Methods

Method | Crop Varieties | Field Type | Explanation
Hemp A | Hemp | |
Hemp B | Hemp (late-sowing) | |
Rice A | Rice (early-sowing) | Dry field |
Rice B | Rice (late-sowing) | Wet field |
Rice C | Rice (late-sowing) | Dry field |
Rice D | Rice (late-sowing) | Wet fields |
Rice E | Rice (upland) | |
Millet A | Proso millet (early-sowing) | |
Millet B | Foxtail millet (early-sowing) | |
Millet C | Proso millet (late-sowing) | |
Millet D | Foxtail millet (late-sowing) | |
Millet E | Sorghum | |
Barnyard Grass A | Barnyard grass | |
Barnyard Grass B | Barnyard grass (late-sowing quick-ripening) | | Double-crop with barley
Beans A | Soy beans (early-sowing) | |
Beans B | Soy beans (late-sowing) | | Relay cropping
Beans C | Soy beans (late-sowing) | Small field | Relay cropping
Beans D | Red beans (early-sowing) | |
Beans E | Red beans (late-sowing) | | Relay cropping
Beans F | Red beans (late-sowing) | Small field | Relay cropping
Beans G | Mung beans | |
Barley/Wheat A | Barley, wheat | Barley fields, wheat fields |
Barley/Wheat B | Barley, wheat | Proso millet fields, foxtail millet fields, soy bean fields, red bean fields, buckwheat fields | Relay cropping
Barley/Wheat C | Barley, wheat | Barren fields |
Barley/Wheat D | Spring barley | |
Sesame A | Sesame | |
Sesame B | Sesame | Barley fields | Relay cropping
Sesame C | Sesame | | Alt - Planting sesame or mung beans first
Sesame C | Perilla | |
Buckwheat A | Buckwheat | |
Buckwheat B | Buckwheat | Fertile soil from forests | Slash-and-burn
Seed Prep A | All | | Post-harvest seed prep
Seed Prep B | Buckwheat, millet (if barren), upland rice (if barren), rice (late-sowing, dry) | | Applying urinated ash
Seed Prep C | Rice (early-sowing, dry), rice (transplanting) | | Water soaking
Soil Prep A | | Dry fields |
Soil Prep B | | Poor fields |
Soil Prep C | | Uncultivated land |
  • Total = 37

Ontology

Classes

Type | Class | Description
Step | Soil Preparation | Field selection, plowing, tilling, etc.
Step | Soil Fertilization | Fertilization of soil
Step | Seed Preparation | Preparing the seeds for planting, such as rinsing them in water, covering them with fertilizer, etc.
Step | Sowing | Putting the seeds into the soil and manipulating the soil so that the seeds can be covered, etc.
Step | Weeding | Removing weeds from the soil by means of hand, hoe, ox, etc.
Step | Harvesting | Cutting or picking the ripe crops
Time | Static Time | Timings based on the sun and moon (i.e. calendar time)
Time | Dependent Time | Timings based on things that have flexible or unpredictable lengths or occurrences, such as "when the plant grows three inches" or "when the snow melts"
Condition | Condition | Conditions under which a step is or is not performed that are not related to time
Cultivation | Crop Variety | A crop variety
Cultivation | Cultivation Method | Sometimes the same crop has an alternative method (such as relay cropping), which can be denoted with this crop method class
Cultivation | Crop Field | The kind of field the crop is being planted in, i.e. what was formerly planted in the field, or dry/wet field (not field condition (barren, fertile, etc.)!)
  • Other classes include tools, ways of doing (heavily, lightly, etc.), rationale for a step, and cautions for a step. However, these were outside the scope of this investigation, which looked specifically at the timing and performance of steps.

Relations

Relation | Domain | Range | Description
hasStep | crop method | step | Way to group the steps for a particular variety's normal method or special method
hasVariety | crop method | crop variety | Way to group the steps for a particular variety's normal method or special method
hasField | crop method | crop field | Way to group the steps for a particular variety's normal method or special method
then | step | step | Simple ordering of steps
doAfterThreeDays | step | step | Still a simple order but with a specific amount of time in between
doImmediately | step | step | Still a simple order but with a specific amount of time in between
doAtNight | step | step | Still a simple order but with a specific amount of time in between
doConcurrently | step | step, condition | When two processes happen at the same time, or when the step happens throughout the period of a particular condition
doWhen | step | time | Do the step when the time event occurs
doBefore | step | time | Do the step before the time event occurs
doAfter | step | time | Do the step after the time event occurs
idealTime | step | time | When there are multiple possible timings, this timing is best
okayTime | step | time | When there are multiple possible timings, this timing is acceptable
worstTime | step | time | When there are multiple possible timings, this timing is worst
doIf | step, crop | condition | Only do the step in this case
doNotDoIf | step, crop | condition | Do not do the step in this case
  • This is not a comprehensive list as I have not been able to look at all of the crops in detail. It is an example of the kinds of time- and condition-related relations there could be.
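
As a rough illustration of how the domain and range columns of the relations table could be enforced while entering data, the following Python sketch checks a single triple against the table. The dictionary layout and the helper name `is_valid` are assumptions made for this example; they are not part of the project's data files.

```python
# Domain and range of each relation, copied from the Relations table.
# The dictionary layout and the helper name are illustrative assumptions.
STEP_TO_STEP = ({"step"}, {"step"})
STEP_TO_TIME = ({"step"}, {"time"})

RELATIONS = {
    "hasStep": ({"crop method"}, {"step"}),
    "hasVariety": ({"crop method"}, {"crop variety"}),
    "hasField": ({"crop method"}, {"crop field"}),
    "then": STEP_TO_STEP,
    "doAfterThreeDays": STEP_TO_STEP,
    "doImmediately": STEP_TO_STEP,
    "doAtNight": STEP_TO_STEP,
    "doConcurrently": ({"step"}, {"step", "condition"}),
    "doWhen": STEP_TO_TIME,
    "doBefore": STEP_TO_TIME,
    "doAfter": STEP_TO_TIME,
    "idealTime": STEP_TO_TIME,
    "okayTime": STEP_TO_TIME,
    "worstTime": STEP_TO_TIME,
    "doIf": ({"step", "crop"}, {"condition"}),
    "doNotDoIf": ({"step", "crop"}, {"condition"}),
}


def is_valid(subject_type, relation, object_type):
    """Return True if the triple respects the relation's domain and range."""
    if relation not in RELATIONS:
        return False
    domains, ranges = RELATIONS[relation]
    return subject_type in domains and object_type in ranges


print(is_valid("step", "doWhen", "time"))  # True
print(is_valid("step", "then", "time"))    # False: "then" links a step to a step
```

A check like this would catch, for example, a timing accidentally attached with "then" instead of "doWhen".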


Example Graphs

Explanation

[Figure: The basic possible relations of a given step]
[Figure: When there are two options for a given step with different timings or conditions]

There are broadly five sets of relation types:

  1. Relations between steps and steps (order)
  2. Relations between steps and the cultivation method of which they are a part
  3. Relations between the cultivation method and the crop varieties and field types for which they apply
  4. Relations between steps and timings
  5. Relations between steps and conditions for when to do/not do the steps

Excluded from this list are tools, cautions, rationales, and ways-of-doing.

Through drawing the graphs, I realized that one variety often has multiple cultivation methods, and that these methods often share certain steps. I therefore could not group all the steps by crop variety alone, or there would be no way to know which order of steps applies to which cultivation method. Instead, I decided to connect each step to a cultivation method, and then connect this cultivation method to one or more varieties and to a field type (as in relay cropping in the root-fields of other crops). This is shown as an example in NJ2018-2_L_barleywheat4.lst.

For example, you can see in the barley/wheat examples (NJ2018-2_L_barleywheat1.lst, NJ2018-2_L_barleywheat2.lst) that the first parts of the cultivation methods are different, but they share the weeding and harvesting parts. Sharing these nodes reduces the total number of nodes while still making it possible to query a particular cultivation method's entire process in order.
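
A minimal Python sketch of such a query is given here, assuming a toy graph in which two methods start differently but share their weeding and harvesting nodes. The method names, step IDs, and the function `ordered_steps` are placeholders invented for illustration and do not reproduce the contents of the .lst files.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# hasStep: which step nodes belong to which cultivation method (placeholder IDs).
has_step = {
    "method_1": {"plow_1", "sow_1", "weed", "harvest"},
    "method_2": {"fertilize_2", "sow_2", "weed", "harvest"},
}

# then: ordering edges between step nodes; "weed" and "harvest" are shared.
then = [
    ("plow_1", "sow_1"), ("sow_1", "weed"),
    ("fertilize_2", "sow_2"), ("sow_2", "weed"),
    ("weed", "harvest"),
]


def ordered_steps(method):
    """Return one method's steps in order, ignoring edges from other methods."""
    steps = has_step[method]
    predecessors = {step: set() for step in steps}
    for earlier, later in then:
        if earlier in steps and later in steps:
            predecessors[later].add(earlier)
    return list(TopologicalSorter(predecessors).static_order())


print(ordered_steps("method_1"))  # ['plow_1', 'sow_1', 'weed', 'harvest']
print(ordered_steps("method_2"))  # ['fertilize_2', 'sow_2', 'weed', 'harvest']
```

Filtering the "then" edges by the method's own hasStep set is what keeps the two processes apart even though they pass through the same shared nodes.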

However, you must be careful not to have two cultivation methods share the same node if the timing or condition is different. For example, both methods may involve sowing by "scattering seeds," but if one is done "in the 5th month" and the other is done "after burning the foliage of plants into ashes," these steps should be kept as separate nodes. Otherwise, we cannot know which condition or timing applies to which cultivation method.
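
The sharing rule can be sketched as a small check: two step nodes are candidates for merging only if both the action and the full set of timing/condition relations match. The data layout, step IDs, and function name here are hypothetical, chosen only to mirror the "scattering seeds" example.

```python
from collections import defaultdict

# Each step node: its action plus its timing/condition relations (illustrative data).
steps = {
    "sow_1": {"action": "scatter seeds",
              "constraints": {("doWhen", "5th month")}},
    "sow_2": {"action": "scatter seeds",
              "constraints": {("doAfter", "burning foliage into ashes")}},
    "weed_1": {"action": "weed", "constraints": set()},
    "weed_2": {"action": "weed", "constraints": set()},
}


def mergeable_groups(step_nodes):
    """Group step IDs that share both the action and every timing/condition."""
    groups = defaultdict(list)
    for step_id, info in step_nodes.items():
        key = (info["action"], frozenset(info["constraints"]))
        groups[key].append(step_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


print(mergeable_groups(steps))  # [['weed_1', 'weed_2']]
```

The two sowing nodes stay separate because their constraints differ, while the two weeding nodes could be consolidated into one shared node.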

There was also the issue of relay cropping, in which the harvesting process of one crop is part of the soil fertilization step of another crop. In this case, a relay crop has two "harvesting" steps: one is the harvesting of the previous crop, and the other is the harvesting of the current crop. The two harvest steps each have unique IDs, but how would the computer know which harvest is of which crop? After all, the computer only knows that the first harvest step is a part of two methods. I wondered whether the individual steps also needed variety labels, but that would just make the overall structure more confusing. I then realized that the crop being harvested in a step can be deciphered indirectly from the direction of the "then" arrows. If a harvesting step comes after a weeding step of a cultivation method, we know that what is being harvested is the crop of that cultivation method. Conversely, if the harvesting comes before the soil prep or sowing steps of a method, we know it is the harvest of the crop in the other cultivation method the step is related to.
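
This inference from arrow direction can be sketched in Python as follows. The toy graph, the step IDs, and the function `harvest_role` are assumptions made for illustration, and the sketch simplifies the rule by looking only at direct "then" neighbours of the harvest step.

```python
# A previous crop and a relay crop that share the node "harvest_p" (placeholder IDs).
has_step = {
    "previous_crop": ["sow_p", "weed_p", "harvest_p"],
    "relay_crop": ["harvest_p", "sow_r", "weed_r", "harvest_r"],
}
step_class = {
    "sow_p": "Sowing", "weed_p": "Weeding", "harvest_p": "Harvesting",
    "sow_r": "Sowing", "weed_r": "Weeding", "harvest_r": "Harvesting",
}
then = {
    ("sow_p", "weed_p"), ("weed_p", "harvest_p"),
    ("harvest_p", "sow_r"), ("sow_r", "weed_r"), ("weed_r", "harvest_r"),
}

# Classes that open a method's process, per the rule described above.
OPENING_CLASSES = {"Soil Preparation", "Sowing"}


def harvest_role(harvest, method):
    """Say whose crop a harvest step reaps, seen from one cultivation method."""
    own_steps = set(has_step[method]) - {harvest}
    for earlier, later in then:
        # Harvest directly follows this method's weeding: it reaps this method's crop.
        if later == harvest and earlier in own_steps and step_class[earlier] == "Weeding":
            return "own crop"
    for earlier, later in then:
        # Harvest directly precedes this method's soil prep or sowing: previous crop.
        if earlier == harvest and later in own_steps and step_class[later] in OPENING_CLASSES:
            return "previous crop"
    return "undetermined"


print(harvest_role("harvest_p", "previous_crop"))  # own crop
print(harvest_role("harvest_p", "relay_crop"))     # previous crop
print(harvest_role("harvest_r", "relay_crop"))     # own crop
```

No variety label is stored on any step; the role falls out of the hasStep membership and the "then" edges alone.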

Regarding instructions like "wait three days" before doing the next step: since the order of the steps is still just "A then B," I kept this as a relation rather than making it another node. If the timing is related not to a step but to a solar/lunar time, or to a time outside the farmer's control, then it was added as a node.

Also, through creating the sample graphs, I realized that the relation "or," as in "For this step, do A or do B," should not be used. The reason is as follows. In the case of barley/wheat, the text says "plow in the 5th or 6th months." We may think to make two relations, "plow--doWhen--5th month" and "plow--doWhen--6th month," and then put an "or" relation between the months, like "5th month---or---6th month." However, if we do this, then for all other crops that use the 5th or 6th month, there will be an "or" relation between the months. For example, if crop B says "plow in the 5th month" and "sow in the 6th month," but this "or" relation is between them, it could look like "plow or sow," which is not the intended meaning. Therefore, regarding time, there are two options. One is to create two steps, such as "plow" and "plow," give each a different time, and then link them both to the previous and following steps, creating a kind of diamond shape. This can be seen in the case of NJ2018-2_L_barleywheat1.lst, where there are three different possible sowing times with different conditions. The other, in the specific case of "plow in the 5th or 6th month," is to change this to "--doAfter--4th month" and "--doBefore--7th month" to denote the entire time span of the 5th and 6th months (which is what I did in the example above).
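
Both options can be sketched in a few lines of Python. Months are represented here as plain integers 1 to 12, which is a simplifying assumption (it ignores intercalary months and spans that wrap around the new year), and the step IDs and function names are placeholders.

```python
triples = [
    # Option 2: "plow in the 5th or 6th month" as an open interval, with no "or".
    ("plow", "doAfter", 4),
    ("plow", "doBefore", 7),
    # Option 1: a diamond of two alternative sowing nodes between the same neighbours.
    ("plow", "then", "sow_option_1"),
    ("plow", "then", "sow_option_2"),
    ("sow_option_1", "then", "weed"),
    ("sow_option_2", "then", "weed"),
]


def allowed_months(step):
    """Expand a step's doAfter/doBefore bounds into the months they permit."""
    after = [o for s, r, o in triples if s == step and r == "doAfter"]
    before = [o for s, r, o in triples if s == step and r == "doBefore"]
    low = max(after) if after else 0
    high = min(before) if before else 13
    return [month for month in range(1, 13) if low < month < high]


def alternatives(previous, following):
    """List the step nodes that sit between two steps (the sides of a diamond)."""
    after_previous = {o for s, r, o in triples if s == previous and r == "then"}
    before_following = {s for s, r, o in triples if r == "then" and o == following}
    return sorted(after_previous & before_following)


print(allowed_months("plow"))        # [5, 6]
print(alternatives("plow", "weed"))  # ['sow_option_1', 'sow_option_2']
```

Because the bounds hang off the "plow" node rather than linking the month nodes to each other, another crop that also refers to the 5th or 6th month is unaffected.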

Future Work

  • After all the data for each crop method is added, it would have to be consolidated to cut out the redundant steps (but make sure to keep the steps distinct if there are different conditions/timings for a similar step).
  • One of my goals was to create a kind of calendar of all the steps to understand the lives of farmers throughout the year. However, 1) I was not able to create the data for all the methods, and 2) I think I would have to put the data into Neo4j and use a very specific query to keep the network graph from looking like a jumbled mess.
  • There are still some unclear aspects of the original text which need to be clarified. For example, in the section on barley and wheat, it says "薄田倍加布草, 如未及刈草, 用糞, 又如大小豆法。" This was translated into Korean as "척박한 밭에서는 풀을 두 배로 더 까는데, 만약 풀을 베지 못했다면 거름을 사용하며, 또한 대두, 소두의 재배법과 동일하다," and in English as "For barren fields, spread twice the amount of weeds, and if the weeds could not be cut, use manure, or also, the method used for soybeans and red beans." The English translation suggests "if there are not enough weeds, use manure, OR use the method of beans," when really it means that this aforementioned fertilization step is "also the same as that of beans." However, when we look at the bean land fertilization method, it just says, "田若塉薄, 用糞灰, 宜小不宜多。" So, is using "用糞灰" for "薄田" the part that "又如大小豆法" is referring to? Manure is also used for hemp, rice, millet, and more: sometimes as a part of the regular process and other times only in the case of barren land; sometimes the bunhoe (糞灰) is applied to the seeds directly before they are sown, and other times it seems to be just put on the soil. This part in particular needs to be studied more carefully before the relationships between the cultivation methods of different crops can be fully understood.

On Translating Nongsa jikseol into Data

One of the most important reasons to translate the Nongsa jikseol into data is that, through the process of making the information contained within the text computer-readable, the meanings become much clearer in a way that merely translating the Hanmun into Korean or English cannot accomplish. Furthermore, as the Nongsa jikseol is meant to be concise, different varieties and methods are grouped together. When translating the text into data, these groupings need to be un-grouped and omissions need to be re-created in order for the complete structure to work. This process of translating the information contained in the Hanmun into data is much more time-consuming than merely translating it into Korean or English, and it requires an overall understanding of the structure of the information contained within the text. Yet, by doing this work, it becomes easier to see the similarities and differences between cultivation methods.