% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utilities.R
\name{update_elements.DT}
\alias{update_elements.DT}
\title{Update selected elements of a DT with new values given in melted format}
\usage{
update_elements.DT(data.old, data.new)
}
\arguments{
\item{data.old}{The DT to update}
\item{data.new}{The data to insert. It must have three columns: {id,variable,new value}. E.g. data.new=data.table("id"=c(810001100105),"variable"=c("AASBIO_CV"),value=c(999999))}
}
\value{
a DT with the updated values
}
\description{
The user provides data.new with the columns {id,variable,new value}. For each row of data.new, the function overwrites the existing value in the matching id row and variable column of data.old with the new value
}
% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/utilities.R
\name{write.excel}
\alias{write.excel}
\title{Utility to copy data to clipboard for pasting to Excel}
\usage{
write.excel(d, getRownames = F, ...)
}
\arguments{
\item{d}{the data to copy}
\item{getRownames}{set to T to also copy row.names}
\item{...}{any other parameter for passing to write.table}
}
\value{
nothing
}
\description{
Utility to copy data to clipboard for pasting to Excel
}
\examples{
write.excel(d);
}
version.txt 0 → 100644
1.0.2
--------
Last issue: 27
TODO:
1. Allow the user to define csv configuration (delimiter, decimal point) and pass it to convert.to.fadn.raw.rds
4. Write a use case where the function convert.to.fadn.str.rds is used to recalculate the raw->str conversion (in case someone changes the map manually)
6. Add the option of encrypting the rds files, see here https://stackoverflow.com/questions/52851725/how-to-protect-encrypt-r-objects-in-rdata-files-due-to-eu-gdpr
8. Provide the option to copy rds content from other data.dir directories
11. On 'manage_data_dir.R > overwrite.raw_str_map.file', re-run all convert.to.str.rds operations (currently only the replacement of the file is taking place)
12. Add the possibility for the user to add a column description of the fadn.raw data (providing a text file)
13. Add the following feature: An R-shiny application for browsing loaded fadn.raw. The user can start this with a simple command.
14. Throw a warning message if load.fadn.{raw,str} does not load anything. Say "No files found to belong to this country and years. Nothing loaded"
15. Create a filter.fadn.str. It will take a fadn.str and a filter(for data.table) expression and will keep only the records for info,costs, crops
16. On convert_data > convert.to.fadn.str.rds, use tryCatch() to report the error and not fail
17. In the raw_str_map.json file, provide the option to define factor levels for a variable
18. Provide the ability to delete country/years from the raw/str files
23. Give the possibility to load str.data passing some filtering for an ID field for str.fadn. In load.fadn.str.rds function
26. Save the SExxx variables to the dat.fadn list object (create an entry in raw_str_maps and add code in the convert.to.fadn.str.rds function)
27. Keep the raw.fadn.rds also in a long format (sparse matrix). Ability to select how to load (wide or long format). Need to know which variables are numeric and which are strings. Keep them in different DT. Long format will return a list with one DT with the numeric values and one with the string values.
CHANGES UNDER WAY:
21. Provide the ability to use an external raw_str_map file (use it and copy it to raw_str_maps).
22. Add the content of the raw_str_map used for convert.to.fadn.str.rds in the attribute of the rds data.
9. Provide full documentation of raw_str_map.json specification (already some in the doc of convert.to.fadn.str.rds function)
CHANGES COMPLETED: (In date-completed descending order / newer changes on the top)
28. Utility function: Update an fadn.raw.rds file with external data (rows of id-column-new value). Load the data and update them with the new values.
27. Give the possibility to load raw.data with row selection based on a criterion (examples: column_x == xxx; column_x>xxx, etc. ) In load.fadn.raw.rds function
24. Provide the ability when load.fadn.raw to pass a vector of columns to load (and discard the rest)
25. added a DEBUG mode for convert.to.fadn.str.rds (detailed information on what is calculated is shown)
20. Write a function that merges two raw_str_map.json files. It will be used if one wants to have a basic raw_str_map and wants to make marginal changes for a specific case (year or country)
19. Provide the ability to use more than one raw_str_map.json (create.data with a vector of raw_str_map.json files, show in contents the raw_str_map.json files, check data dir structure changes, convert with specifying which)
2. Save loaded data to stored.rds.data.RData (added store.rds.data function, restore.rds.stored.data function, also show the saved.data rds in the show.data.dir.contents)
3. Provide the option to provide a file with the description of the variables for the fadn.str.rds files (data.dir specific). Probably alter the raw_str_map.json specification
5. Create a folder spool, where the users can put relevant files
1. Make the map_definition an organic part of the fadnUtils.data.dir
2. Add the option of storing/not storing the original csv from DG AGRI in the data.dir folder
10. On 'load.fadn.str.rds', output the message "Loading from ..." with <cat> instead of <print>
====================================================
OLD
====================================================
1.0.1
--------
CHANGES:
1. Keep data in folder, not included in the package
*.html
*.R
---
title: "fadnUtils"
author: ""
output: #rmarkdown::html_vignette
word_document
vignette: >
%\VignetteIndexEntry{fadnUtils}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# Introduction
The fadnUtils package facilitates efficient handling of FADN data in R. The package is targeted for use within the JRC D.4 context, which implies a specific temporal pattern in how a user interacts with it (see Figure \@ref(fig:foo)).
More specifically, after a request for FADN data from DG-AGRI, the data is delivered to JRC D.4 in csv format.
```{r foo, echo=FALSE, fig.cap="Overview of how the user interacts with the package.",fig.align = 'center', out.width = '100%'}
knitr::include_graphics("pic/workflow.png")
```
# Installation
`fadnUtils` and the related R packages can be installed and loaded as follows.
```{r results='hide', message=FALSE, warning=FALSE}
requiredPackages = c('fadnUtils','data.table', 'devtools','jsonlite', 'ggplot2')
for(p in requiredPackages){
  if(!require(p,character.only = TRUE)) install.packages(p)
  library(p,character.only = TRUE)
}
```
# Usage in Brief
After loading the packages, you will have a functional `fadnUtils` installation on your computer. The rest of this vignette walks through a typical session:
1. Create a working directory
    - a user-defined data directory
1. Import CSV FADN data
    - convert the csv data into raw r-data
    - convert raw r-data into str r-data
1. Load r-data and structured r-data
1. Perform analysis
## 1. Create a working directory
First, the user sets a working directory. Make sure that any relative path you use stays within `CurrentProjectDirectory`.
```{r}
# using a local directory
CurrentProjectDirectory = "D:/public/yang/MIND_STEP/New_test_fadnUtils"
create.data.dir(folder.path = CurrentProjectDirectory)
set.data.dir(CurrentProjectDirectory)
get.data.dir()
```
### Required files
We request FADN data from DG-AGRI, which delivers it to us in csv format. To work efficiently in R, we convert the csv data to an R-friendly format. This step relies on a human-readable JSON file, referred to as `raw_str_map.file`. Both files are required:
1. FADN data in csv format: the data for loading
2. A json file for extracting the variables
### Folder Structure
The working directory can be any folder chosen by the user. Its internal structure helps data management and maintenance. The directory looks like this:
```base
CurrentProjectDirectory/
+-- csv
+-- fadnUtils.metadata.json
+-- rds
\-- spool
    \-- readme.txt
```
* csv: stores the CSV files
* fadnUtils.metadata.json: contains the mapping from the fadn.raw.rds to the fadn.str.rds data
* rds: stores the r-data
* spool: keeps related files
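
To confirm that a data directory has the layout above, one option is to test for each expected entry with base R. The sketch below builds a throwaway directory under `tempdir()` purely for illustration; `demo.dir`, `expected.entries` and `found` are names invented here and are not part of `fadnUtils`.

```r
# Illustrative only: mimic the layout in a temporary folder
demo.dir <- file.path(tempdir(), "fadnUtils_demo")
for (d in c("csv", "rds", "spool")) {
  dir.create(file.path(demo.dir, d), recursive = TRUE, showWarnings = FALSE)
}
file.create(file.path(demo.dir, "fadnUtils.metadata.json"))

# Entries named in the folder structure above
expected.entries <- c("csv", "fadnUtils.metadata.json", "rds", "spool")

# file.exists() returns TRUE for both files and directories
found <- file.exists(file.path(demo.dir, expected.entries))
names(found) <- expected.entries
print(found)
```

For a real project, replacing `demo.dir` with the value returned by `get.data.dir()` should give the same kind of overview.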
## 2. Import CSV FADN data
First, we will import the data into an R-friendly format using the fadnUtils package.
### Convert the csv data into raw r-data
The raw data will be added to a `rds` directory. We use a convenient function from this package to convert the csv file into raw r-data.
```{r}
convert.to.fadn.raw.rds(
  file.path = "D:/public/yang/MIND_STEP/Fake_Data/BEL2009.csv",
  sepS = ",",
  fadn.country = "BEL",
  fadn.year = 2009,
  #keep.csv = T # copy csv file in csv.dir
  col.id = "id"
)
```
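
Conceptually, this step reads a delimited text file and stores the resulting table as an `.rds` file. The base-R sketch below shows that general idea only. It is not the implementation of `convert.to.fadn.raw.rds`, and the CSV content (`VAR_A`, `VAR_B`, the two ids) is hypothetical. Only the file name pattern `fadn.raw.<year>.<country>.rds` is taken from this vignette.

```r
# Hypothetical two-farm extract; real FADN files have many more columns
demo.csv <- tempfile(fileext = ".csv")
writeLines(c("id,VAR_A,VAR_B",
             "1001,12.5,30.1",
             "1002,8.0,45.7"),
           demo.csv)

# Read the csv with an explicit separator
raw <- read.csv(demo.csv, sep = ",")

# Store the table as rds, then read it back
demo.rds <- file.path(tempdir(), "fadn.raw.2009.BEL.rds")
saveRDS(raw, demo.rds)
reloaded <- readRDS(demo.rds)

# The round trip preserves the table
isTRUE(all.equal(raw, reloaded))
```

Storing the data as rds avoids re-parsing the csv on every load and keeps column types intact between sessions.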
At any time, we can check which csv files (countries, years) have been loaded into the current data dir.
```{r}
show.data.dir.contents()
```
### Convert raw r-data into structured r-data
Next, we convert the raw data into structured data. Broadly, there are 3 steps:
1. setting up a `structured` directory for the structured data,
2. checking the `raw_str_map.file` to make sure that all variables can be converted,
3. converting the raw data and saving the structured data into the `structured` directory.
#### Set a `structured` directory for saving the structured data
We create a `test` folder to hold the structured data.
```{r}
rds.dir = paste0(get.data.dir(),"/rds/")
# set a name for the folder that will hold the structured r-data in rds.dir
new.str.name = "test"
# build the path of the extraction_dir and create it
new.extraction.dir = paste0(rds.dir, new.str.name)
dir.create(new.extraction.dir)
```
#### Check the variables in the `raw_str_map.file`
Before conversion, it is recommended to run `check.column()` to ensure that all variables in the `raw_str_map.file` can be converted.
```{r results='hide', message=FALSE, warning=FALSE}
list_vars = check.column(
# a rds file or a csv file
importfilepath = paste0(rds.dir, "fadn.raw.2009.BEL.rds"),
# a json file
jsonfile = "D:/public/yang/MIND_STEP/2014_after_copy.json",
# write a new json file without unmatched variables
rewrite_json = T,
# save the new json in extraction_dir
extraction_dir = new.extraction.dir)
```
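
The idea behind such a check can be illustrated with plain set operations. This is a minimal sketch under stated assumptions, not the logic of `check.column()`. The vectors `raw.columns` and `map.variables` are hypothetical stand-ins for the column names of the raw data and the variables referenced in the json file.

```r
# Hypothetical inputs: columns available in the raw data,
# and variables that a map file refers to
raw.columns   <- c("id", "VAR_A", "VAR_B", "VAR_C")
map.variables <- c("VAR_A", "VAR_C", "VAR_X")

# Variables referenced by the map but absent from the data
unmatched <- setdiff(map.variables, raw.columns)

# Variables that can be converted safely
usable <- intersect(map.variables, raw.columns)

print(unmatched)
print(usable)
```

Here `VAR_X` would be reported as unmatched, which is the kind of variable a rewritten json file would leave out.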
#### Convert the raw data into structured r-data using the checked json file
Finally, we can convert raw r-data to str r-data using an external json file. For more details on conversion with the fadnUtils package, see `USE_CASE.R`.
```{r}
convert.to.fadn.str.rds(fadn.country = "BEL",
                        fadn.year = 2009,
                        raw_str_map.file = "D:/public/yang/MIND_STEP/new_sample/test01/raw_str_map.json", # an external json file
                        str.name = new.str.name, # extraction_dir
                        force_external_raw_str_map = T,
                        DEBUG = F
)
```
#### Files Structure in `rds` folder
After conversion, the `rds` folder contains:
* `fadn.raw.2009.BEL.rds`: raw r-data for country "BEL" and year "2009"
* `test`: extraction_dir holding the structured r-data and the json files
    * `fadn.str.2009.BEL.rds`: structured r-data for country "BEL" and year "2009"
    * `raw_str_map.json`: default json file
    * `rewrite_2014_after_copy.json`: modified json file after checking the variables
```base
rds
+-- fadn.raw.2009.BEL.compressed.rds
+-- fadn.raw.2009.BEL.rds
+-- fadn.raw.2010.BEL.compressed.rds
+-- fadn.raw.2010.BEL.rds
+-- fadn.raw.2011.BEL.compressed.rds
+-- fadn.raw.2011.BEL.rds
+-- fadn.raw.2012.BEL.compressed.rds
+-- fadn.raw.2012.BEL.rds
\-- test
    +-- fadn.str.2009.BEL.rds
    +-- raw_str_map.json
    \-- rewrite_2014_after_copy.json
```
## 3. Load raw r-data and structured r-data
To start any analysis with `fadnUtils`, we first need to load r-data. We can only load data for countries and years that have already been imported into a data.dir folder.
### Load raw r-data for the country `BEL` and year `2009`
```{r}
my.data.2009.raw = load.fadn.raw.rds(
countries = "BEL",
years = 2009
)
```
### Load structured data for the country `BEL` and year `2009`
We can load structured data for country `BEL` and year `2009`.
```{r}
my.data.2009.str = load.fadn.str.rds(
countries = "BEL",
years = 2009,
extraction_dir = "test" # Location of the str r-data
)
```
### Load structured data from all available countries and years
The following is an example of loading structured data for all available countries and years.
```{r}
my.str.data = load.fadn.str.rds( extraction_dir = "test")
```
## 4. Perform analysis
Here are some examples of analysing the loaded data.
### Collecting the common id
We can collect the common id from the loaded r-data using the `collect.common.id()` function of `fadnUtils`.
```{r, message=FALSE}
# Collect the common id from the loaded structured r-data
collected.common.id_str = collect.common.id(my.str.data)
```
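
To make the notion of a common id concrete, the sketch below finds the ids that appear in every year of a toy table. It assumes that "common" means present in all loaded years. It is not the implementation of `collect.common.id()`, and the table `toy.info` is invented for illustration.

```r
library(data.table)

# Toy panel: farm 1 is observed in all three years, farms 2 and 3 are not
toy.info <- data.table(
  ID   = c(1, 2, 3, 1, 2, 1, 3),
  YEAR = c(2009, 2009, 2009, 2010, 2010, 2011, 2011)
)

# Number of distinct years in the data
n.years <- uniqueN(toy.info$YEAR)

# Keep the ids observed in every year
common.id <- toy.info[, .(n = uniqueN(YEAR)), by = ID][n == n.years, ID]
print(common.id)
```

Restricting an analysis to such ids gives a balanced panel, at the cost of dropping farms that enter or leave the sample.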
### Plot
To build a basic plot, we use the `ggplot()` function from the plotting package `ggplot2`.
```{r}
crops.data = my.str.data$crops # shorthand for easier access in the next steps
# This contains the number of crops for each farm-country-year.
# Be careful: we have to filter to count only the LEVL variable
crops.data.Ncrops = crops.data[VARIABLE=="LEVL",.N,by=list(COUNTRY,YEAR,ID)]
# This displays the quantiles of the number of crops
crops.data.Ncrops[,as.list(quantile(N)),by=list(YEAR,COUNTRY)][order(COUNTRY)]
ggplot(crops.data.Ncrops,aes(y=N,x=1)) +
  geom_boxplot() +
  facet_grid(YEAR~COUNTRY) +
  theme(axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank()
  )+
  ylab("Number of Crops")
```
### Some other examples
```{r}
# sample and represented number of farms
my.str.data$info[,list(Nobs_sample=.N,Nobs_represented=sum(WEIGHT)),
                 by=.(COUNTRY,YEAR)]
# only for full sample (common id over years in selected data)
my.str.data$info[id %in% collected.common.id_str[[1]],
                 list(Nobs_sample=.N,
                      Nobs_represented=sum(WEIGHT)),
                 by=.(COUNTRY,YEAR)]
```
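
The difference between the sample size and the represented number of farms is easiest to see on a small table. The sketch below applies the same aggregation to invented data. `toy.info` and its weights are hypothetical, and only the column names `COUNTRY`, `YEAR` and `WEIGHT` come from the example above.

```r
library(data.table)

# Three sampled farms, each standing for WEIGHT farms in the population
toy.info <- data.table(
  COUNTRY = c("BEL", "BEL", "BEL"),
  YEAR    = c(2009, 2009, 2010),
  WEIGHT  = c(10, 25, 40)
)

# Count the sampled farms and sum their weights per country and year
toy.summary <- toy.info[, list(Nobs_sample = .N,
                               Nobs_represented = sum(WEIGHT)),
                        by = .(COUNTRY, YEAR)]
print(toy.summary)
```

For 2009, the two sampled farms represent 35 farms. For 2010, the single sampled farm represents 40.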
**Note:** Please read `Use_CASE.R` for more details on using fadnUtils.
File added
vignettes/pic/workflow.png

44.2 KiB