Commit


Commit MetaInfo

Revision: a706cae975c6cffdf33bdf5f9456c6d349bd3733 (tree)
Time: 2022-11-11 04:09:18
Author: Lorenzo Isella <lorenzo.isella@gmai...>
Committer: Lorenzo Isella

Log Message

Also remove the erroneous cases in tam, i.e. those in which all the aid figures are missing.

Change Summary

Incremental Difference

diff -r cc26971bf365 -r a706cae975c6 R-codes/create_tam_parquet.R
--- a/R-codes/create_tam_parquet.R Thu Nov 10 16:58:32 2022 +0100
+++ b/R-codes/create_tam_parquet.R Thu Nov 10 20:09:18 2022 +0100
@@ -4,6 +4,7 @@
 library(janitor)
 library(lubridate)
 library(stringr)
+library(openxlsx)
 
 source("/home/lorenzo/myprojects-hg/R-codes/stat_lib.R")
 
@@ -166,18 +167,16 @@
                "granted_value_extended_eur" ,
                "nominal_value_extended_eur" ,
                "is_covid_case"
-    )
+    )|> ### remove the wrong cases
+    filter(!is.na(granted_aid_absolute_eur) |
+           !is.na(nominal_aid_absolute_eur) |
+           granted_range_eur!="0 - "
+           )
 
 
 
 
 
-## write_dataset(
-##     df_new,
-##     format = "csv",
-##     path = "./data_output/",
-##     max_rows_per_file = 1e7
-## )
 
 
 write_dataset(
@@ -188,38 +187,15 @@
 )
 
 
-## test <- df_new |>
-##     filter(granted_value_extended_eur==0) |>
-##     collect()
-
 
-## test2 <- df_new |>
-##     filter(nominal_value_extended_eur==0) |>
-##     collect()
-
-## test3 <- df_new |>
-##     filter(nominal_aid_absolute_eur==0) |>
+## cases_wrong <- df_new |>
+##     filter(is.na(granted_aid_absolute_eur),
+##            is.na(nominal_aid_absolute_eur),
+##            granted_range_eur=="0 - "
+##            ) |>
 ##     collect()
 
-
-## test4 <- df_new |>
-##     filter(granted_aid_absolute_eur==0) |>
-##     collect()
-
-
-## test5 <- df_new |>
-##     filter(is.na(nominal_aid_absolute_eur)) |>
-##     collect()
-
-## test6 <- df_new |>
-##     filter(is.na(granted_aid_absolute_eur)) |>
-##     collect()
-
-
-## test7 <- df_new |>
-##     filter((nominal_aid_absolute_eur==0) & (granted_aid_absolute_eur==0)) |>
-##     collect()
-
+## save_excel(cases_wrong, "tam_errors.xlsx")
 
 
 
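
The filter added in this commit and the commented-out `cases_wrong` query are intended as complements of each other: a row is kept when at least one aid figure is present or the range differs from "0 - ", and flagged as wrong otherwise. The sketch below illustrates this on an in-memory data frame with plain dplyr. The column names are taken from the diff, but `df_toy`, `case_id` and all values are invented for illustration, and the real `df_new` may be an Arrow dataset rather than a data frame. It also shows one edge case worth checking: a row whose `granted_range_eur` is itself `NA` ends up in neither set, because `filter()` drops rows for which the condition evaluates to `NA`.

```r
library(dplyr)

# Toy data: column names follow the diff; case_id and all values are made up.
df_toy <- data.frame(
  case_id                  = c("a", "b", "c", "d"),
  granted_aid_absolute_eur = c(100, NA, NA, NA),
  nominal_aid_absolute_eur = c(NA_real_, NA, NA, NA),
  granted_range_eur        = c("0 - ", "0 - ", "100 - 500", NA)
)

# Rows to keep: same condition as the filter added in the commit.
kept <- df_toy |>
  filter(!is.na(granted_aid_absolute_eur) |
         !is.na(nominal_aid_absolute_eur) |
         granted_range_eur != "0 - ")

# Rows considered wrong: same condition as the commented-out query.
cases_wrong <- df_toy |>
  filter(is.na(granted_aid_absolute_eur),
         is.na(nominal_aid_absolute_eur),
         granted_range_eur == "0 - ")

kept$case_id         # "a" "c"
cases_wrong$case_id  # "b"
# Row "d" (both aid columns and the range are NA) appears in neither result.
```

If rows like "d" should also count as wrong, one option would be to test `is.na(granted_range_eur)` explicitly in both conditions; whether such rows exist in the tam data is not stated in the commit.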