Fundamentals of Data Visualization (English)(1), 23/24-P

Data visualization is an integral part of data analysis and data science. It supports both exploratory data analysis and the presentation of data and analysis results to different audiences. During the course, students learn data visualization with Python and R, two of the most widely used programming languages in data science, and become acquainted with the data visualization tools Power BI and Tableau. Data visualization in R can be done with the tools in its base package, but additional packages provide enhanced visualization capabilities. Students learn the ggplot2 package for R (and plotnine, its Python analogue), which can be used to create charts and diagrams that display one, two, or more variables. The course also covers the Python libraries matplotlib and seaborn. Students are also introduced to specialized packages for interactive and spatial data visualization. In parallel with data visualization, students learn simple data manipulation. While working on their independent assignments, students learn the R Markdown and knitr packages and the web application Jupyter Notebook, which support writing a report document, annotating the code while a visualization is being developed, and exporting the document in various formats, thus making the data analysis reproducible. All classes take place in a computer lab, which lets students apply their knowledge in practice immediately, and the theoretical material of each lecture is supplemented with tasks to be solved in class.