/r/dataisbeautiful
I loaded three fact-checking analyses from Politifact (links below) into a workspace for analysis and classified the findings as: true, needs context, misleading and false.
I extracted the information from the text of each analysis and sorted each claim into one of these four categories based on the Politifact analysis.
I am not super political, but this data is fascinating and I'd love to repeat it across multiple fact-checking sources. This post is intended to be a visualization from one source across three election cycles. However, there are so many other fact-checking sources out there. My request is that you drop a link in a comment below to any fact-checking site you think would be interesting to visualize, and I will repeat this visualization for each source and post a link.
Data source: https://www.ispo.com/en/health/sport-health-retail-9-game-changing-mindshifts-future
This is the only pan-European survey of over 1,800 people that focuses specifically on the interaction between sport and health in order to provide information and guidance for decision-makers in sport and healthcare. The white paper also clarifies the potential in retail, why corporate health is important, and where the sports and healthcare industries need to address the unfulfilled expectations and deficits of the healthcare system.
Data: https://en.wikipedia.org/wiki/Lists_of_school_shootings_in_the_United_States
Tools: R
General description: each cell represents the total number of individuals injured or killed in school shootings on that day of the week (see definitions in the source above). I started by looking at day of week within week of month within month of year, but it was way too difficult to read, and I probably would have had to process the data more heavily (due to nulls across weekends and other considerations with how to count a week).
This analysis involved aggregating data on school shootings by date, calculating the total incidents per day, and extracting date components like the week of the month and the year. A calendar heatmap was then created using a color gradient from light purple to dark yellow to visually represent the frequency of incidents across different decades, months, and weeks within the month, with outliers mitigated using log scaling.
I ordered the months August through July to run roughly from the start to the end of the school year, and started the week on Monday to better illustrate the first school day of the week.
If people are interested I will post the data/code.
And here is one where I adjusted it to be just week of year. Again, more legible than both day of week and week of year but I felt day of week was more interesting. https://imgur.com/QdGeGIs
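The aggregation described in the general description above (sum per date, extract date components, order months August through July, log-scale the totals) could be sketched roughly as follows. The original was done in R and its code was not posted, so this Python version is only an illustration: the function names, the week-of-month rule (days 1-7 count as week 1) and the sample records are all assumptions, not the poster's method or real data.

```python
import math
from collections import defaultdict
from datetime import date

# School-year month order: August first, July last.
SCHOOL_YEAR_MONTHS = [8, 9, 10, 11, 12, 1, 2, 3, 4, 5, 6, 7]


def week_of_month(d):
    """Week of month under an assumed simple rule: days 1-7 are week 1, 8-14 week 2, ..."""
    return (d.day - 1) // 7 + 1


def heatmap_cells(records):
    """Aggregate (date, casualties) records into log-scaled heatmap cells.

    Returns a dict keyed by (decade, month, week_of_month, weekday), where
    weekday is 0 for Monday through 6 for Sunday.
    """
    # Step 1: total per calendar date.
    per_day = defaultdict(int)
    for d, casualties in records:
        per_day[d] += casualties

    # Step 2: fold each date into its heatmap cell.
    cells = defaultdict(int)
    for d, total in per_day.items():
        decade = d.year // 10 * 10
        cells[(decade, d.month, week_of_month(d), d.weekday())] += total

    # Step 3: log1p compresses outliers (and would map a zero total to 0).
    return {key: math.log1p(total) for key, total in cells.items()}


# Made-up records, purely to exercise the function (not real data).
records = [
    (date(2010, 8, 16), 2),
    (date(2010, 8, 16), 3),
    (date(2011, 1, 5), 1),
]
cells = heatmap_cells(records)

# Order cells so August comes first and July last, as in the plot.
ordered_keys = sorted(cells, key=lambda k: (k[0], SCHOOL_YEAR_MONTHS.index(k[1]), k[2], k[3]))
for key in ordered_keys:
    print(key, round(cells[key], 3))
```

Python's `date.weekday()` already counts Monday as 0, which matches the Monday-first layout without extra work.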
The US Census Bureau just released official income data characteristics for 2023. Here is my visual of some of them. See source for definitions of characteristics.
## How to read (Example):
"High School Diploma": The median household income for a household whose householder has a high school diploma (but no college) is $55,810. 24.1% of ALL US households fall into this category.
Source: https://www.census.gov/library/publications/2024/demo/p60-282.html
Tools: Created with Excel.
The Cardinals had a win probability close to 80% when they were leading the game by 17 points. The Bills raised their win probability to 60% when they scored their second TD.
Data from the R package nflreadr. I created a Python function to automate the process: it takes only the home team and the date and returns the plot with the logos, lines and scores. Any feedback is appreciated.
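The poster's function was not shared, so the following is only a guess at the data-selection half of such a helper: pick one game by home team and date, then build the two win-probability lines. The field names (`home_team`, `game_date`, `game_seconds_remaining`, `home_wp`) are modeled on nflverse play-by-play columns but should be treated as assumptions, the rows are made up, and the plotting step (logos, scores) is left out.

```python
def win_probability_series(plays, home_team, game_date):
    """Return (seconds_elapsed, home_wp, away_wp) tuples for one game, in game order.

    plays: iterable of dicts, one per play (hypothetical schema, see lead-in).
    """
    game = [
        p for p in plays
        if p["home_team"] == home_team and p["game_date"] == game_date
    ]
    if not game:
        raise ValueError(f"no plays found for {home_team} on {game_date}")

    # Most time remaining first, i.e. chronological order.
    game.sort(key=lambda p: p["game_seconds_remaining"], reverse=True)

    # 3600 seconds of regulation; overtime is ignored in this sketch.
    return [
        (3600 - p["game_seconds_remaining"], p["home_wp"], 1.0 - p["home_wp"])
        for p in game
    ]


# Made-up plays for two placeholder games (not real data).
plays = [
    {"home_team": "AAA", "game_date": "2024-01-07", "game_seconds_remaining": 1800, "home_wp": 0.2},
    {"home_team": "AAA", "game_date": "2024-01-07", "game_seconds_remaining": 3600, "home_wp": 0.5},
    {"home_team": "AAA", "game_date": "2024-01-07", "game_seconds_remaining": 0, "home_wp": 1.0},
    {"home_team": "CCC", "game_date": "2024-01-07", "game_seconds_remaining": 3600, "home_wp": 0.6},
]
series = win_probability_series(plays, "AAA", "2024-01-07")
for elapsed, home_wp, away_wp in series:
    print(elapsed, home_wp, away_wp)
```

The resulting series can be handed to any plotting library as two lines over elapsed game time.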
Source article: https://www.csmonitor.com/Environment/2024/0909/air-conditioning-heat-wave-electricity-costs
Data Source: https://www.eia.gov/consumption/residential/ https://www.eia.gov/electricity/monthly/epm_table_grapher.php?t=epmt_5_6_a
Tools: Datawrapper, Adobe Illustrator
Don't think this has been done before (words spoken yes, but not lines per character)
Observations:
5 Characters with highest average lines across all episodes:
*Next best is Kevin with 8 lines, 12 fewer lines per episode. These 5 characters were by far the most present in the series.
Creed has his own fanbase with only 2.1 lines per episode, as does Nate with 0.3 lines per episode.
5 characters with the highest average lines, counting only episodes in which they have at least 1 line (excluding episodes where they do not appear or speak at all):
Notes:
Source: https://transcripts.foreverdreaming.org/viewforum.php?f=574
Tool for scraping and viz: Python
let me know what you think
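For readers comparing the two rankings above, the difference between "average across all episodes" and "average only in episodes with at least 1 line" comes down to the denominator. A minimal sketch, assuming the scraped transcripts have already been reduced to one line count per character per episode; the function name, data layout and numbers are illustrative and not taken from the post:

```python
def average_lines(lines_per_episode):
    """Compute both averages for every character.

    lines_per_episode: list of dicts, one per episode, mapping
    character name -> number of lines in that episode.
    Returns {character: (avg_over_all_episodes, avg_over_episodes_with_lines)}.
    """
    n_episodes = len(lines_per_episode)
    totals = {}
    appearances = {}
    for episode in lines_per_episode:
        for character, count in episode.items():
            if count > 0:
                totals[character] = totals.get(character, 0) + count
                appearances[character] = appearances.get(character, 0) + 1

    return {
        character: (
            totals[character] / n_episodes,              # silent episodes count as 0 lines
            totals[character] / appearances[character],  # silent episodes are excluded
        )
        for character in totals
    }


# Made-up counts for two placeholder characters across four episodes.
episodes = [
    {"A": 10, "B": 2},
    {"A": 20},
    {"A": 30, "B": 4},
    {"A": 20},
]
averages = average_lines(episodes)
print(averages["A"])  # (20.0, 20.0): speaks in every episode, so both averages match
print(averages["B"])  # (1.5, 3.0): speaks in half the episodes
```

A character who appears rarely but talks a lot when present will rank much higher on the second measure than on the first.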