/r/learnSQL
Adventures in the most popular database language
Sequel
/r/learnSQL
Hi all.
I have a query that is quite complex in PostgreSQL. By complex, I mean it has a lot of joins, counts, CTEs, etc. I also believe this is the best way to get my desired results with the current DDL I am working with. I won't paste the query itself, however I will show an anonymized part of it that seems to be where it is taking a long time. Just a little rundown:
1. The first CTE is a union between two tables, this part does not take a long time at all to execute.
2. The query that I perform on this CTE above, its goal is to retrieve the records belonging to a group with the highest date column.
This is what this looks like:
-- foo CTE: Union between two tables, different logic for joins / where, etc.
WITH foo AS (
SELECT
col1 AS col1,
col2 AS col2,
col3 AS col3,
col4 AS col4,
col5 AS col5
FROM tbl1 t1
JOIN t1 ...
JOIN t2 ...
WHERE
...
UNION ALL
SELECT
col1 AS col1,
ccol2 AS col2,
col3 AS col3,
col4 AS col4,
col5 AS col5
FROM tbl2 t2
JOIN t1 ...
JOIN t2 ...
WHERE
...
)
SELECT
DISTINCT ON (f.col1, f.col2, f.col3)
f.col1,
f.col2,
f.col3,
f.col5
FROM foo f
ORDER BY
f.col1,
f.col2,
f.col3,
f.col4 DESC -- This is where the logic is to pull the record having the highest col4 (date column), from the distinct group of col1, col2, and col3.
;
Here is where my issue is. Because of the complexity of the source data, a CTE seems the only viable approach. This CTE will return a couple million records, growing as time goes on, and the logic to do the distinct on / order by takes forever. I cannot index a CTE.
I've tried making that CTE a temporary table instead, but just creating the temporary table itself takes too long. I need this logic to be returned at endpoint level speeds, and the logic above is just one part of many albeit the part that is taking by far the longest, how can I achieve this?
Thank you!
I have been looking to learn SQL efficiently and found three websites being recommended for beginners: SQLZOO, SQLBolt and LearnSQL. They all look roughly the same on the topics they cover, but what’s the difference in the things they teach and the way they do it? Which one you found to be the „best”?
Hi,
I am looking for workshops on SQL (3+ days duration) in the US before year end. Preferably Atlanta but open to other states.
Must be in person, no virtual classrooms or online courses.
Any recommendations appreciated.
I just finished Jose Portilla's Sql bootcamp. It was great but after finishing I feel like I only learned beginner to intermediate concepts. Does anyone know a course like this that can teach me more advanced concepts?
I wanted to share something I've created that might help with learning SQL. It's called "Advent of SQL" - a collection of 24 daily PostgreSQL challenges for this December, designed to be learning-friendly! Basically advent of code but only for SQL 😅
What it is:
I remember how overwhelming SQL can be when you're learning, so I've tried to make these challenges accessible while still being interesting. Hopefully there's something for everyone here.
The challenges will be available at adventofsql.com starting December 1st.
I'd love to hear what kinds of SQL concepts you find challenging or what topics you'd like to practice. Feel free to ask any questions about the format or difficulty level!
I'm having a blast building out these challenges so I hope you enjoy them, and I hope this helps someone.
Happy learning! 🎄
Hi! I have a long list of list with Code1 and Code2. I need to compare this with a table which has fields Code1, Code2 and True. In the extract, I need Code1s which only meet Cases 3 and 4.
I have tried different things but cannot get the correct output. All help is appreciated and TIA!
Code1 | Code2 | True? | Cases | Case Explanation |
---|---|---|---|---|
3326659 | AA665544 | Y | Case 1 | Code2 exists only with True=Y |
1245789 | AB665665 | Y | Case 2 | Code2 exists more than once with atleast one row with True=Y |
1545789 | AB665665 | N | Case 2 | Code2 exists more than once with atleast one row with True=Y |
6655447 | AC225545 | N | Case 3 | Code2 exists only with True=N |
9584750 | Case 4 | Code2 does not exist |
I started a series of videos teaching SQL for beginners.
I had this idea, that embedded SQL is a perfect start, because there is no need to install any databases, any database viewers, or sign up for any sandboxes or playgrounds online. Just launch your terminal, and go and practice. So it would remove a barrier to start, and then later all the installs etc would come.
I made a couple of videos, and now I would like to ask you, if that works?
And if this idea doesn't work, what would work better (for you) instead?
Maybe using a terminal is scary and already a big barrier for a beginner? Or maybe installing a database or signing up for some cloud is not a barrier for you at all?
Really curious to hear what is your take on this.
Here's the link to my latest (second) video:
https://youtu.be/HSDWHmD3Ooo
I'm reading Learn SQL in 24 Hours by Ryan Stephens and I'm trying my best to not give up --
I've successfully downloaded Oracle and have access to SQLPLUS -- but the first directions say to create a table based on the supplemental downloaded material
Earlier in the book there was a link to the books website that had 4 downloads that are called tables.sql, data.sql and 2 bonus materials of similar names
when I download they are Pdf's -- now here's where I'm stuck
the directions say to use the following code to load the table into the database...
SQL> start c:\sqlbook\tables.sql
The issue i'm running into is that this (see attached photo) -- I'm such a newbie that I'm not sure if these directions are written badly or if I just didn't do something right (and yes, I voluntarily chose to learn SQL)
Any help?
Edit: Forgot Picture SMH
I’m learning SQL from a book and got ChatGPT to set me some practice exercises. Now I tried to create a practice database in MySQL Workbench but it has a visual interface where everything is prepared and you just click buttons. What free program can I use where I have to type the actual code myself and then it lets me see my stuff? I know the code for creating database, tables, update the table, do some simple and nested queries. Thanks for your help!
Hi All. I have a SAS program that runs 16 subprograms. I ideally would like for the first program to call all the subprograms and I click the first program and it runs all. I am new to using SAS. Any help ?
I'm gonna insert large number of data into the table I need to find number of rows inserted successfully and number of rows that failed to insert
This should be a simple question for you guys out there. I'm a beginning and doing a training class in SQL, and ran into this query. This query solves the problem, but I don't understand WHY it works. Specifically the date range. The problem asks for items that were NOT SOLD, yet the and statement is taking what WAS SOLD between those dates. Can anyone help explain?
Here is the goal:
For products that weren't sold even once between 2015-02-01 and 2015-02-05, show the product name (rename the column to product_name
), it's price and the producer's name (rename the column to company_name
). You should display all products that haven't been sold in this interval, also the ones that don't belong to any company.
As background:
This is a grocery store-type database, with 3 relevant tables.
product (id, name, department_id, shelf_id, producer_id, price)
Sales_History (date, product_id, amount)
Producer (id, name)
This is the query that works:
select
p.name
as product_name,
p.price,
prod.name
as company_name
from product p
left join sales_history sh
`on sh.product_id =` [`p.id`](http://p.id)
and
sh.date
between '2015-02-01' and '2015-02-05'
left join producer prod
`on p.producer_id =` [`prod.id`](http://prod.id)
where sh.product_id is null;
Hi folks! Curently having 0 experience in SQL or programming whatsoever, I am interested in developping a database with GUI. Would help me a lot some recommandations regarding a developping tool that would:
The database would have some manual input and should be able to determine certain changes based on those inputs, generate different kind of reports and export them in XCEL, give notifications through the GUI regarding certain events (sorry if I used a rough language xD). Thank u all in advance!
https://www.youtube.com/watch?v=AZ29DXaJ1Ts
There is this 36 minute video project. Is it worth being put on my resume as a data intern. I am currently a supprot engineer trying to get roles as a data anlayst.
Or how long did it take from the beginning to feel like you knew what you were doing?
I'm working on a personal project with SQLite and I'm having trouble understanding the best way to use joins. I have a number of main entries which each have from 0-N media entries and each media entry has from 1-N part entries. So it's many parts linked to 1 media and many media linked to 1 main entry. Here's a diagram https://drawsql.app/teams/na-756/diagrams/test. What's the best way to use joins to get all the partTable.fileLocations that link to a mainTable.plexID? Are joins even the best/right option?
I'm just working on a side project for experience and my portfolio, and I have an SQL Server database that's about 120MB. I'd rather not pay for it if I can avoid it, but every service I've seen so far is at least $5 a month.
Hey everyone. I'm currently learning Databricks and I have a hard time differentiating where standard SQL stops and Databricks begins. Databricks is built on top of SQL (from what I understand). In this example the select statement is selecting the details column, but there's a colon afterwards which seems odd for SQL, but maybe idk if SQL can really do this.. Just curious
Processing img 3vn1ghqjhzvd1...
Hey folks!
I am currently in a sys-admin role since 3 y, with a data analysis component, but it feels like a dead-end job and i am lacking a perspective.
My basic education is a technical university degree in environmental engineering and I therefore have no basic database knowledge. In order to enhance my career options in IT, I will put my focus on this topic in the next few months to make progress as quickly as possible. Best case:
I will start with this e-book, in combination with tutorials:
https://datubaze.wordpress.com/wp-content/uploads/2020/04/celkos-sql-for-smarties-2005.pdf
I would be happy if I could create updates here from time to time and build up a checklist in diary format.
Why i do that? Because i found a quite suitable job description, which matches quite good, but i lack the necessary mySQL and database knowledge. No excuses - so i´ll start today. Let´s try how fit i can make myself in 30 days. Maybe it´s also beneficial on the actual job.
19.10.2024:
20.10.2024:
21.10.2024 - 25.10.2024:
Completely knocked out by a flu. Restart the grind today.
26.10.2024:
Update in 24h.
Hello everyone!
So I posted here a few days ago about a free coupon for my SQL problem solving course on Udemy, but all spots were redeemed so quickly and a few people reached out to me because they saw the post late and the coupon was already expired, so this time here's what I'll do, I have another free 100 spots that you can DM me and get yours but please only DM me if you really need this and don't share it with random people because I've found a small percentage actually engaged with the course after getting their spot.
So to sum up, DM me if:
1- you need this to actually prepare for an interview or expand your SQL problem solving skills.
2- you can't actually afford it at the moment
Otherwise, you can buy it here
Also, please upvote or leave a comment on this post so it reaches everyone.
Sorry for the long post and thank you!
As the title, I want to download as much datasets as possible which have various tables and columns to practice SQL query. I've looked through kaggel but for example, sales dataset show mostly dataset with only 1 csv file with like 5 ,6 column. I really want to train my querying skill from multiple tables but currently Im unable to find a good sources of dataset. Can anybody please help me , thanks.
I want to reset a verified boolean to false if the person the number is assigned to deletes their profile
CREATE TABLE phone_numbers (
phone_number_id BIGINT PRIMARY KEY GENERATED ALWAYS AS IDENTITY,
phone_number VARCHAR NOT NULL,
verified BOOLEAN DEFAULT false,
assigned_to BIGINT REFERENCES personnel ON DELETE SET NULL,
)
How do I perform a simple check or case function to say: "if 'assigned_to' is NULL, set 'verified' to false"?
Is this possible to set in a CREATE TABLE statement, or must it occur separately?
Like to know what I'm doing wrong here.
Link to HackerRank problem - https://www.hackerrank.com/challenges/full-score/problem?isFullScreen=true
Now here is my query to the solution (I like to use CTE as much as possible due to readability and modularity of code. Also a lot of interviews expect you to write this way)
WITH hacker_full_score AS(
--count the number of challenges with full score for hackers
SELECT h.hacker_id, h.name, COUNT(*) AS challenges_full_score
FROM submissions AS s
INNER JOIN hackers AS h
ON s.hacker_id = h.hacker_id
INNER JOIN challenges AS c
ON c.challenge_id = s.challenge_id
INNER JOIN difficulty AS d
ON d.difficulty_level = c.difficulty_level
WHERE d.score = s.score
GROUP BY h.hacker_id
HAVING challenges_full_score > 1
)
SELECT hfs.hacker_id, hfs.name
FROM hacker_full_score AS hfs
ORDER BY hfs.challenges_full_score DESC, hfs.hacker_id ASC;
Upon executing I get this error:
Error message upon CTE style execution
I ran it through ChatGPT and found no errors. So I ask it to frame without CTE which magically works. Come to notice its the same thing except the CTE.
Here is what worked:
SELECT h.hacker_id, h.name
FROM submissions AS s
INNER JOIN hackers AS h
ON s.hacker_id = h.hacker_id
INNER JOIN challenges AS c
ON c.challenge_id = s.challenge_id
INNER JOIN difficulty AS d
ON d.difficulty_level = c.difficulty_level
WHERE d.score = s.score
GROUP BY h.hacker_id
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC, h.hacker_id ASC;
I have submitted previous solutions using CTE so I know the platform supports it. I'm not sure why is it doing this for this solution. Any insights will be helpful.
Thanks
Hey everyone,
I’ve created a SQL problem-solving course specifically designed to help people prepare for SQL interviews using a LeetCode study plan (50 questions from easy to hard). The course has already helped over 1000 students in just one week, and it’s received 5-star reviews. 🎉
I’m offering free access for a limited time. You can check it out here:
👉 SQL Problem Solving for Interviews
The course covers:
If you're prepping for a data analyst, developer, or tech interview, this might be useful. Also, if you like the course, I’d appreciate it if you left a review. 🙏
Thanks, and happy learning!
Hello everyone, Hope everyone is doing well. I am learning sql for analytics tasks for one project. when i downloaded the workbench, I downloaded it from the official website and the macos version, ARM version to be precise. The installation was successful, but everytime i open workbench, I keep getting a message that "few functions might not work since sql is not designed for your system".
Has anyone else faced this, ? Is there anything I need to do?
PS: I am using macbook pro, 2022 , with m2 chip
Hey everyone, I’ve recently started revising SQL since it’s a common requirement for all backend programming roles. I’m looking for a study partner with whom I can share my knowledge, and who’s also willing to exchange what they know. I believe learning with someone else is the best way to improve quickly and remember concepts for a longer time. Plus, when there’s another person involved, they can help catch any mistakes I might make while understanding SQL, which will only help us both. If you’re interested in practising SQL together, feel free to message me. We can use Discord or Google Meet to discuss and work on problems.
About me: M24 GMT+5:30. When replying or messaging, please tell me your region's timezone so I can save it, and accordingly, I can message. Also share your Discord ID so we can voice chat directly. Currently trying to switch to the IT sector. I have no problem connecting with people from different time zones or any age group!
If anyone is interested to learn SQL, DM me.. I can teach personally for a small fee
so i basically learnt that you need a matching column to join two tables
but while practising on a website it asks me to join two tables without any matching column
help pls
ON matches.winner = players.player_name
this is the code