Question

Problem 1 (50 pts). This problem will involve the nycflights13 dataset (including tables airlines, airports, planes and weather), which we saw in class. It is available in both R and Python; however, R is recommended for at least the visualization portion of the question. Start by installing and importing the dataset to your chosen platform. We will first use joins to search and manipulate the dataset, then we will produce a flightpath visualization.

Question e) Produce a map that colors each destination airport by the average air time of its incoming flights. Here is a code snippet to draw a map of all flight destinations, which you can use as a starting point. You may need to install the maps package if you have not already. Adjust the title, axis labels and aesthetics to make this visualization as clear as possible. Hint: You may find it useful to use a different type of join in your solution than the one in the snippet.

    airports %>%
      semi_join(flights, c("faa" = "dest")) %>%
      ggplot(aes(lon, lat)) +
      borders("state") +
      geom_point() +
      coord_quickmap()
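One possible way to adapt the snippet is sketched here, not as a definitive solution. It assumes the dplyr, ggplot2, maps and nycflights13 packages are installed. The names avg_air_time, dest_air_time and air_time_map are illustrative and do not come from the assignment. The sketch replaces semi_join with inner_join, because semi_join only filters airports, while inner_join also carries the summarised air time into the plotted data.

```r
library(dplyr)
library(ggplot2)
library(nycflights13)

# Average air time per destination. Flights with a missing air_time
# (for example cancelled flights) are dropped first to avoid NaN means.
avg_air_time <- flights %>%
  filter(!is.na(air_time)) %>%
  group_by(dest) %>%
  summarise(avg_air_time = mean(air_time))

# inner_join keeps only airports that appear as a destination and
# adds the avg_air_time column next to lon and lat.
dest_air_time <- airports %>%
  inner_join(avg_air_time, by = c("faa" = "dest"))

# Illustrative labels and colour scale; adjust to taste.
air_time_map <- ggplot(dest_air_time, aes(lon, lat, colour = avg_air_time)) +
  borders("state") +
  geom_point() +
  scale_colour_viridis_c() +
  coord_quickmap() +
  labs(
    title = "Average air time of flights from NYC, by destination (2013)",
    x = "Longitude",
    y = "Latitude",
    colour = "Average air time (minutes)"
  )

print(air_time_map)
```

Destinations outside the contiguous United States fall outside the state borders drawn by borders("state"), so the axis limits may need adjusting depending on which airports should be shown.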

Answer & Explanation Solved by verified expert
Relational data

Introduction

The datamodelr package is used to draw database schema.

    library("tidyverse")
    library("nycflights13")
    library("viridis")
    library("datamodelr")

nycflights13

Exercise 13.2.1

Imagine you wanted to draw (approximately) the route each plane flies from its origin to its destination. What variables would you need? What tables would you need to combine?

Drawing the routes requires the latitude and longitude of the origin and the destination airports of each flight. This requires the flights and airports tables. The flights table has the origin (origin) and destination (dest) airport of each flight. The airports table has the longitude (lon) and latitude (lat) of each airport. Getting the latitude and longitude for the origin and destination of each flight requires two joins from flights to airports: once for the latitude and longitude of the origin airport, and once for the latitude and longitude of the destination airport. I use an inner join in order to drop any flights with missing airports, since they will not have a longitude or latitude.

    flights_latlon <- flights %>%
      inner_join(select(airports, origin = faa, origin_lat = lat, origin_lon = lon),
        by = "origin"
      ) %>%
      inner_join(select(airports, dest = faa, dest_lat = lat, dest_lon = lon),
        by = "dest"
      )

This plots the approximate flight paths of the first 100 flights in the flights dataset.

    flights_latlon %>%
      slice(1:100) %>%
      ggplot(aes(
        x = origin_lon, xend = dest_lon,
        y = origin_lat, yend = dest_lat
      )) +
      borders("state") +
      geom_segment(arrow = arrow(length = unit(0.1, "cm"))) +
      coord_quickmap() +
      labs(y = "Latitude", x = "Longitude")

Exercise 13.2.2

I forgot to draw the relationship between weather and airports. What is the relationship and how should it appear in the diagram?

The column airports$faa is a foreign key of weather$origin. The following drawing updates the one in Section 13.2 (https://r4ds.had.co.nz/relational-data.html#nycflights13-relational) to include this relation. The line representing the new relation between weather and airports is colored black. The lines representing the old relations are gray and thinner.

[Diagram: nycflights13 relations (diagrams/nycflights.png)]

Exercise 13.2.3

Weather only contains information for the origin (NYC) airports. If it contained weather records for all airports in the USA, what additional relation would it define with flights?

If the weather was included for all airports in the US, then it would provide the weather for the destination of each flight. The weather data frame columns (year, month, day, hour, origin) are a foreign key for the flights data frame columns (year, month, day, hour, dest). This would provide information about the weather at the destination airport at the time of the flight take off, unless the arrival date-time were calculated.

So why was this not a relationship prior to adding additional rows to the weather table? In a foreign key relationship, the collection of columns in the child table must refer to a unique collection of columns in the parent table. When the weather table only contained New York airports, there were many values of (year, month, day, hour, dest) in flights that did not appear in the weather table. Therefore, it was not a foreign key. It was only after all combinations of year, month, day, hour and airports that are defined in flights were added to the weather table that this relation existed between these tables.

Exercise 13.2.4

We know that some days of the year are "special", and fewer people than usual fly on them. How might you represent that data as a data frame? What would be the primary keys of that table? How would it connect to the existing tables?

I would add a table of special dates, similar to the following table.

    special_days [...]

    [...] arrange(year, month, day, sched_dep_time, carrier, flight) %>%
      mutate(flight_id = row_number()) %>%
      glimpse()

Exercise 13.3.2

Identify the keys in the following datasets:

1. Lahman::Batting
2. babynames::babynames
3. nasaweather::atmos
4. [...]
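A candidate key can be checked by counting rows per combination of the candidate columns and looking for counts above one. The sketch below is only an illustration of that check. It uses the nycflights13 tables rather than the datasets listed in the exercise, so that no additional packages are assumed, and the names planes_dupes and flight_dupes are hypothetical.

```r
library(dplyr)
library(nycflights13)

# tailnum as a candidate key for planes:
# zero rows here means no tail number is repeated.
planes_dupes <- planes %>%
  count(tailnum) %>%
  filter(n > 1)

# (year, month, day, flight) as a candidate key for flights:
# any rows here mean the combination does not identify a single flight.
flight_dupes <- flights %>%
  count(year, month, day, flight) %>%
  filter(n > 1)

nrow(planes_dupes)
nrow(flight_dupes)
```

The same pattern, count() on the candidate columns followed by filter(n > 1), can be pointed at any of the datasets in the exercise once their packages are installed.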