Xinxin Yang's avatar
Xinxin Yang committed
# fadnUtils
The `fadnUtils` package facilitates efficient handling of FADN data in R. The package is targeted at use within the JRC D.4 context, which implies a specific temporal pattern in how a user interacts with it (see the workflow figure below).
![plot](inst/examples/pic/workflow.png)

More specifically, after a request for FADN data from DG-AGRI, this data is delivered to JRC D.4 in csv format. 

# Installation 
`fadnUtils` and the related R packages can be installed and loaded as follows.

```{r results='hide', message=FALSE, warning=FALSE}
requiredPackages = c('fadnUtils','data.table', 'devtools','jsonlite', 'ggplot2')
for(p in requiredPackages){
  if(!require(p,character.only = TRUE)) install.packages(p)
  library(p,character.only = TRUE)
}
```

# Usage in Brief
After loading the packages, you have a functional `fadnUtils` installation on your computer. The typical workflow then consists of the following steps:

1. Create a working directory
    - a user-defined data directory
1. Import CSV FADN data
    - convert the csv data into raw r-data
    - convert raw r-data into structured r-data
1. Load raw r-data and structured r-data
1. Perform analysis

## 1. Create a working directory
First, the user sets a working directory. Make sure that the relative path stays within `CurrentProjectDirectory`.
```{r}
# using a local directory
CurrentProjectDirectory = "D:/public/yang/MIND_STEP/New_test_fadnUtils"
create.data.dir(folder.path = CurrentProjectDirectory)
set.data.dir(CurrentProjectDirectory)
get.data.dir()
```
### Required files
We request FADN data from DG-AGRI, which delivers it to us in csv format. To work efficiently in R, we convert the csv data to an R-friendly format. This step relies on a human-readable mapping file, referred to as `raw_str_map.file`. Both files are required:

1. FADN data in csv format: the data for loading
2. A json file for extracting the variables

### Folder Structure 
The user can choose any location for the working directory. Its internal structure helps with data management and maintenance. The directory looks like this:

```base
CurrentProjectDirectory/
+-- csv
+-- fadnUtils.metadata.json
+-- rds
\-- spool
    \-- readme.txt
```
* csv: CSV files are stored here
* fadnUtils.metadata.json: contains the mapping from the fadn.raw.rds to the fadn.str.rds data
* rds: r-data files are stored here
* spool: keeps related files
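
The layout above can be checked programmatically before any import. The following is a minimal sketch in base R and is not part of `fadnUtils`: the helper name `check_data_dir_layout()` is hypothetical, and the expected entries are simply taken from the tree shown above. It builds a throwaway directory so that it runs without real FADN data.

```r
# Hypothetical helper (not part of fadnUtils): report which of the
# expected entries are present in a data directory.
check_data_dir_layout <- function(data.dir) {
  expected.dirs  <- c("csv", "rds", "spool")
  expected.files <- c("fadnUtils.metadata.json")
  c(
    vapply(expected.dirs,
           function(d) dir.exists(file.path(data.dir, d)),
           logical(1)),
    vapply(expected.files,
           function(f) file.exists(file.path(data.dir, f)),
           logical(1))
  )
}

# Demonstration on a temporary directory that mimics the layout above
demo.dir <- file.path(tempdir(), "fadn_layout_demo")
for (d in c("csv", "rds", "spool")) {
  dir.create(file.path(demo.dir, d), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(demo.dir, "fadnUtils.metadata.json"))

# Named logical vector: one entry per expected directory or file
layout.ok <- check_data_dir_layout(demo.dir)
print(layout.ok)
```

In a real project, `check_data_dir_layout(get.data.dir())` could be called after `set.data.dir()` to spot a missing folder early.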

## 2. Import CSV FADN data
First, we will import the data into an R-friendly format using the fadnUtils package.

### Convert the csv data into raw r-data
The raw data will be added to the `rds` directory. We use a convenience function from this package to convert the csv file into raw r-data.

```{r}
fadn.data.dir <- "D:/public/data/fadn/lieferung_20210414/csv/"
# load data for country BEL and year 2009
convert.to.fadn.raw.rds(
  file.path = paste0(fadn.data.dir, "BEL2009.csv"),
  sepS = ",",
  fadn.country = "BEL",
  fadn.year = 2009
  # keep.csv = TRUE # copy csv file in csv.dir
)
```
At any time, we can check which csv files (countries, years) have been loaded into the current data directory.
```{r}
show.data.dir.contents()
```
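
When several deliveries have to be imported, the country and year can be derived from the file names instead of being typed by hand. The sketch below is an illustration in base R, not a `fadnUtils` feature: the helper `parse_fadn_csv_name()` is hypothetical, and it assumes that files are named like `BEL2009.csv` (three capital letters followed by a four-digit year), as in the example above. The conversion call is only attempted if `convert.to.fadn.raw.rds()` is available and the file exists.

```r
# Hypothetical helper: split a file name such as "BEL2009.csv" into
# country and year. Files that do not match the pattern are dropped.
parse_fadn_csv_name <- function(csv.files) {
  base <- basename(csv.files)
  ok <- grepl("^[A-Z]{3}[0-9]{4}\\.csv$", base)
  data.frame(
    file    = csv.files[ok],
    country = substr(base[ok], 1, 3),
    year    = as.integer(substr(base[ok], 4, 7)),
    stringsAsFactors = FALSE
  )
}

# Example input; "notes.txt" is ignored because it does not match the pattern
csv.files <- c("BEL2009.csv", "BEL2010.csv", "notes.txt")
todo <- parse_fadn_csv_name(csv.files)
print(todo)

# With fadnUtils loaded, each existing file could then be converted
# using the same arguments as in the example above.
for (i in seq_len(nrow(todo))) {
  if (exists("convert.to.fadn.raw.rds") && file.exists(todo$file[i])) {
    convert.to.fadn.raw.rds(
      file.path    = todo$file[i],
      sepS         = ",",
      fadn.country = todo$country[i],
      fadn.year    = todo$year[i]
    )
  }
}
```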

### Convert raw r-data into structured r-data
Next, we convert the raw data into structured data. Broadly, there are three steps:

1. setting up a `structured` directory for the structured data,
2. checking with the `raw_str_map.file` that all variables can be converted,
3. converting the raw data and saving the result in the `structured` directory.

#### Set a `structured` directory for saving the structured data
We create a `test` folder to hold the structured data.

```{r}
rds.dir = paste0(get.data.dir(), "/rds/")
# set a name for the folder that holds the structured r-data in rds.dir
new.str.name = "test"
# create the extraction_dir
new.extraction.dir = paste0(rds.dir, new.str.name)
dir.create(new.extraction.dir)
```

#### Check the variables in the `raw_str_map.file`
Before conversion, it is recommended to call `check.column()` to ensure that all variables in the `raw_str_map.file` can be converted.
```{r results='hide', message=FALSE, warning=FALSE}
list_vars = check.column(
  # an rds file or a csv file
  importfilepath = paste0(rds.dir, "fadn.raw.2009.BEL.rds"),
  # a json file
  jsonfile = "D:/public/yang/MIND_STEP/2014_after_copy.json",
  # write a new json file without unmatched variables
  rewrite_json = TRUE,
  # save the new json in extraction_dir
  extraction_dir = new.extraction.dir
)
```
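
The internals of `check.column()` are not shown here. Conceptually, a check of this kind amounts to comparing the variable names referenced by the mapping with the columns present in the raw data. The sketch below illustrates that idea in base R with made-up names (`VAR_A`, `VAR_B`, `VAR_C` are not real FADN variables); it is an assumption about the general idea, not the package's implementation.

```r
# Toy raw data: column names stand in for FADN variable codes (made up)
raw.data <- data.frame(ID = 1:3, VAR_A = c(10, 20, 30), VAR_B = c(1, 0, 1))

# Toy mapping: variable names that a json mapping is assumed to reference
mapped.vars <- c("ID", "VAR_A", "VAR_C")

# Variables that can be converted, and variables missing from the raw data
matched   <- intersect(mapped.vars, names(raw.data))
unmatched <- setdiff(mapped.vars, names(raw.data))

print(matched)
print(unmatched)
```

A rewritten mapping "without unmatched variables", as produced with `rewrite_json = TRUE`, would in this picture keep only the entries in `matched`.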


#### Convert the raw data into structured r-data using the checked json file
Finally, we can convert raw r-data to structured r-data, optionally using an external json file. For more details on conversion in the `fadnUtils` package, see `USE_CASE.R`.
```{r}
# Variant 1: no mapping file is passed
convert.to.fadn.str.rds(fadn.country = "BEL",
                        fadn.year = 2009,
                        str.name = new.str.name # extraction_dir
)

# Variant 2: an external json mapping file is passed explicitly
convert.to.fadn.str.rds(fadn.country = "BEL",
                        fadn.year = 2009,
                        raw_str_map.file = "D:/public/yang/MIND_STEP/new_sample/test01/raw_str_map.json", # an external json file
                        str.name = new.str.name, # extraction_dir
                        force_external_raw_str_map = TRUE,
                        DEBUG = FALSE
)
```
#### File structure in the `rds` folder
After conversion, the `rds` folder contains:

* `fadn.raw.2009.BEL.rds`: raw r-data for country "BEL" and year "2009"
* `test`: extraction_dir holding the structured r-data and the json files
* `fadn.str.2009.BEL.rds`: structured r-data for country "BEL" and year "2009"
* `raw_str_map.json`: default json file
* `rewrite_2014_after_copy.json`: modified json file after checking the variables

```base
rds
+-- fadn.raw.2009.BEL.compressed.rds
+-- fadn.raw.2009.BEL.rds
+-- fadn.raw.2010.BEL.compressed.rds
+-- fadn.raw.2010.BEL.rds
+-- fadn.raw.2011.BEL.compressed.rds
+-- fadn.raw.2011.BEL.rds
+-- fadn.raw.2012.BEL.compressed.rds
+-- fadn.raw.2012.BEL.rds
\-- test
     +-- fadn.str.2009.BEL.rds
     +-- raw_str_map.json
     \-- rewrite_2014_after_copy.json
```
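
Because the file names encode year and country, the content of the `rds` folder can be summarised with a few lines of base R. The following sketch is illustrative only: `list_raw_rds()` is a hypothetical helper, and the pattern `fadn.raw.<year>.<country>.rds` is inferred from the listing above. The `.compressed.rds` files are deliberately skipped by this pattern.

```r
# Hypothetical helper: extract year and country from raw rds file names
# of the form "fadn.raw.<year>.<country>.rds".
list_raw_rds <- function(file.names) {
  pattern <- "^fadn\\.raw\\.([0-9]{4})\\.([A-Z]{3})\\.rds$"
  hits <- regmatches(file.names, regexec(pattern, file.names))
  hits <- hits[lengths(hits) == 3]  # keep only names that matched
  data.frame(
    year    = as.integer(vapply(hits, `[`, character(1), 2)),
    country = vapply(hits, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

# File names copied from the listing above; in practice they could come
# from list.files() applied to the rds directory.
rds.files <- c("fadn.raw.2009.BEL.compressed.rds", "fadn.raw.2009.BEL.rds",
               "fadn.raw.2010.BEL.rds", "test")
available <- list_raw_rds(rds.files)
print(available)
```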

## 3. Load raw r-data and structured r-data
To start any analysis with `fadnUtils`, we first need to load r-data. We can only load data for countries and years that have already been imported into a data.dir folder.

### Load raw r-data for the country `BEL` and year `2009`
```{r results='hide', message=FALSE, warning=FALSE}
my.data.2009.raw = load.fadn.raw.rds(
  countries = "BEL",
  years = 2009
)
```
### Load structured data for the country `BEL` and year `2009`
We can load structured data for country `BEL` and year `2009`.
```{r results='hide', message=FALSE, warning=FALSE}
my.data.2009.str = load.fadn.str.rds(
  countries = "BEL",
  years = 2009,
  extraction_dir = "test" # Location of the str r-data
)
```
### Load structured data from all available countries and years
The following example loads structured data for all available countries and years.

```{r results='hide', message=FALSE, warning=FALSE}
my.str.data = load.fadn.str.rds( extraction_dir = "a")
```

## 4. Perform analysis
Here are some examples of analysing the loaded data.

### Collect the common id
We can collect the common id from the loaded r-data using the `collect.common.id()` function of `fadnUtils`.

```{r, message=FALSE}
# Collect the common id from the loaded structured r-data
collected.common.id_str = collect.common.id(my.str.data)
```
### Plotting
To build a basic plot, we use the `ggplot` function from the plotting package `ggplot2`.

```{r results='hide', message=FALSE, warning=FALSE}
crops.data = my.str.data$crops #catering for easier access at next steps

#this contains the number of crops for each farm-country-year/