/r/Python
If you have questions or are new to Python use r/LearnPython
News about the dynamic, interpreted, interactive, object-oriented, extensible programming language Python
You can find the rules here.
If you are about to ask a "how do I do this in python" question, please try r/learnpython, the Python discord, or the #python IRC channel on Libera.chat.
Please don't use URL shorteners. Reddit filters them out, so your post or comment will be lost.
Posts require flair. Please use the flair selector to choose your topic.
Posting code to this subreddit:
Add 4 extra spaces before each line of code
    def fibonacci():
        a, b = 0, 1
        while True:
            yield a
            a, b = b, a + b
Online Resources
Five life jackets to throw to the new coder (things to do after getting a handle on python)
PyMotW: Python Module of the Week
Online exercises
programming challenges
Asking Questions
Try Python in your browser
Docs
Libraries
Related subreddits
Python jobs
Newsletters
Screencasts
Motivation: Last week, I posted about my project, Netfly: The Netflix Translator, here on r/python. I initially built it to solve a problem I ran into while traveling. Let me explain:
On a flight from New Delhi to Tokyo, I started watching an anime movie, The Concierge. The in-flight entertainment had English subtitles, and I was hooked, but I couldn’t finish it. Later, I found the movie on Netflix Japan, but it was only available with Japanese subtitles.
Here’s the problem: I don’t know enough Japanese (Nihongo wa sukoshi desu) to follow along, so I decided to build something that could fetch those Japanese subtitles, translate them into English, and overlay the translation on the video while retaining the Japanese subtitles, which would give me better context.
What started as a personal project quickly became an obsession.
What the Project Does: The primary goal of this project is simple: convert Japanese subtitles on Netflix into English subtitles in an automated way. This is particularly useful when English subtitles aren’t available for a title.
The Evolution of this Project / High Level Tech Solution: This is not the first iteration of Netfly. It has gone through two major updates based on feedback and my own learning.
Iteration 1: A Tech-Heavy but Costly Solution
How It Worked:
The Result: It worked, but it was far from practical. The cost of using Google Vision API for every frame made it unsustainable, and the whole process was painfully slow.
Iteration 2: Streamlining with Subtitles file
The Result: This was much better: cheaper, faster, and simpler. But there was still a manual step: downloading the subtitle file.
Iteration 3: Fully Automated Workflow
The Result: All steps are now fully automated.
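As a rough illustration of the subtitle-file approach from Iterations 2 and 3, the sketch below parses a small TTML-style XML snippet and pairs each cue with a translation. The sample XML, the `parse_cues` and `translate_cues` names, and the dictionary standing in for a translation service are all assumptions made for illustration; the real Netflix file layout and Netfly's own code may well differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical TTML-style sample; real subtitle files may be structured differently.
SAMPLE = """<tt xmlns="http://www.w3.org/ns/ttml">
  <body><div>
    <p begin="00:00:01.000" end="00:00:03.000">こんにちは</p>
    <p begin="00:00:04.000" end="00:00:06.000">ありがとう</p>
  </div></body>
</tt>"""

TTML_NS = "{http://www.w3.org/ns/ttml}"


def parse_cues(xml_text):
    """Return a list of (begin, end, text) tuples, one per <p> cue."""
    root = ET.fromstring(xml_text)
    cues = []
    for p in root.iter(TTML_NS + "p"):
        text = "".join(p.itertext()).strip()
        cues.append((p.get("begin"), p.get("end"), text))
    return cues


def translate_cues(cues, translate):
    """Keep the original line and add its translation next to it."""
    return [(begin, end, text, translate(text)) for begin, end, text in cues]


# Stand-in for a real translation service call (assumption).
FAKE_DICTIONARY = {"こんにちは": "Hello", "ありがとう": "Thank you"}

for cue in translate_cues(parse_cues(SAMPLE), lambda t: FAKE_DICTIONARY.get(t, t)):
    print(cue)
```

Keeping the original text next to the translation mirrors the goal of showing both subtitle lines at once.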
Target Audience: This project started as a personal tool, but it can be useful for:
Comparison with Other Similar Tools: Existing tools, like Chrome extensions, rely on pre-existing subtitles in the target language. For example, they can overlay English subtitles, but only if those subtitles are already available. Netfly is different because
To the best of my knowledge, no other tool automates this entire flow.
Working Demo / Screenshots:
https://imgur.com/a/vWxPCua
https://imgur.com/a/zsVkxhT
https://imgur.com/a/bWHRK5H
https://imgur.com/a/pJ6Pnoc
What's next: This is still a work in progress, but I feel it’s in a solid state now. Here’s what’s on my mind for the next steps:
Edge Cases: Testing on a broader range of Netflix titles to handle variations in subtitle formats.
Performance: Optimizing XML parsing and translation for faster processing.
Extensibility: Adding support for other subtitle languages.
Error Handling: Since I iterated very fast, I know the error handling is not up to the mark.
If this sounds interesting to you, the code is up on GitHub: https://github.com/Anubhav9/Netfly-subtitle-converter-xml-approach
I’d love to hear your thoughts, feedback and suggestions on this.
Cheers, and thank you!
Hi potential bots,
I'm a backend developer who works with Python and Flask. I also recently started using IIS to host our RESTful API backend on an on-premises Windows server. Damn! Nice intro I got.
So the issue: I need to host a Power Automate Desktop application (whatever that blue, box-code-style software is called) on a Windows server using IIS. It should be running all the time, but the VM might be locked after some time.
I also have a solution there that uses a watchdog to do some stuff after PA's processing is done (Excel creation automation task).
So sharks, my ask would be: how the fruit do I set up a Power Automate application when I have never worked with it? Please share detailed steps or else I might bite you.
Regards, Your BF
P.S.: I don't know a thing. Pls just 🍻 with me. Nor did I search for this on Bing 😏.
community but I believe more in peeps here.
TL;DR: how to host a Power Automate Desktop application on a Windows server and keep it running forever.
Stumbled upon a useful Python resource? Or are you looking for a guide on a specific topic? Welcome to the Resource Request and Sharing thread!
Share the knowledge, enrich the community. Happy learning! 🌟
What My Project Does
From Adhami the author: I was wondering what 2048 would feel like if, instead of powers of two, we could merge consecutive Fibonacci numbers. It turns out to be a rather interesting game that is fairly forgiving and grows very slowly. I found it difficult to come up with an overall strategy. I had a simple search algorithm that was able to achieve a score of exactly 66,666 (not joking). Getting a 987 block shouldn't be difficult.
You can take a look into the code here: https://github.com/adhami3310/987 (the simple search algorithm is inside the code as well)
Target Audience: Anyone
Comparison: Similar to 2048 but fib
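To make the merge rule concrete, here is a minimal sketch of how one row might collapse to the left when only consecutive Fibonacci numbers combine. It is not taken from the linked repository; the function names, the use of `0` for an empty cell, and the "each tile merges at most once per move" rule are assumptions borrowed from classic 2048.

```python
def are_consecutive_fib(a, b):
    """True if a and b are neighbours in 1, 1, 2, 3, 5, 8, ..."""
    lo, hi = sorted((a, b))
    if lo <= 0:
        return False
    x, y = 1, 1
    while y < hi:
        x, y = y, x + y
    return (x, y) == (lo, hi)


def merge_row_left(row):
    """Slide tiles left, merging adjacent consecutive Fibonacci tiles once."""
    tiles = [v for v in row if v]          # drop empty cells (0)
    merged = []
    i = 0
    while i < len(tiles):
        if i + 1 < len(tiles) and are_consecutive_fib(tiles[i], tiles[i + 1]):
            merged.append(tiles[i] + tiles[i + 1])
            i += 2                         # both tiles are consumed
        else:
            merged.append(tiles[i])
            i += 1
    return merged + [0] * (len(row) - len(merged))


print(merge_row_left([1, 1, 2, 0]))
print(merge_row_left([2, 3, 5, 8]))
```

Unlike 2048, two equal tiles such as `2, 2` do not merge here, because only `1, 1` is a pair of equal neighbours in the sequence.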
Blog post: https://blog.pypi.org/posts/2024-11-14-pypi-now-supports-digital-attestations/
I'm angry that it got partially funded by the Sovereign Tech Fund, when it's about "securing" uploads by giving the keys to huge USA companies. I think it's criminal they got public money for this.
I also don't think it adds any security whatsoever. It just moves the authentication from using credentials to PyPI to using credentials to github. They can be stolen in the exact same way.
edit: It got "GERMAN" public money.
I would like some user feedback
Github Link: https://github.com/DevER-M/yami
Pypi Link: https://pypi.org/project/yami-music-player/
Some of the features
Libraries used
Target audience
This project will be useful for people who do not want ads and want a simple user interface to play music
Comparison
There are currently no projects that have all the features covered and are made with tkinter.
To use this, install all requirements in the .txt file and you are good to go.
RoadMap
I will update it now and then
A follow would be nice! https://github.com/DevER-M
So I wanted to build a face_recognition attendance system, but what the heck, there is always an error with either this or dlib. At first it would not install, and now that it is installed it isn't working properly. I triple-checked the code, and the issue is with this library. On Linux it runs shockingly well, but unfortunately I have to use Windows.
I compiled a list of puzzles to improve Python. I hope this blog post serves as a humble guide for anyone interested in improving their Python by solving puzzles.
Hello everyone, a Python programmer here. I'm just wondering whether there are any project or research ideas that can be implemented in the field of space exploration and technology, because I'm obsessed with space ;) Just give me suggestions. Happy coding ;)
Hey Python enthusiasts! Any VFX folks here? I've developed a little package called fxgui, a collection of Python classes and utilities designed for building Qt-based UIs in VFX-focused DCC applications.
It's available on GitHub, PyPI, and comes with documentation. I'd love to hear your thoughts and get some feedback!
What it does:
dispatchery is a lightweight Python package for function dispatching inspired by the standard singledispatch decorator, but with support for complex, nested, parameterized types such as tuple[str, dict[str, int | float]].
Comparison:
Unlike singledispatch, dispatchery can dispatch based on:
Target Audience:
Python developers who don't like having a bunch of if isinstance checks everywhere in their code.
Example :
from dispatchery import dispatchery

@dispatchery
def my_func(value):
    return "Standard stuff."

@my_func.register(list[str])
def _(value):
    return "Strings!"

@my_func.register(list[int] | list[float])
def _(value):
    return "Numbers!"

@my_func.register(str, int | float, option=str)
def _(value1, value2, option):
    return "Two values and a kwarg!"

# my_func(42) or my_func("hello") will return "Standard stuff."
# my_func(["a", "b", "c"]) will return "Strings!"
# my_func([1, 2, 3]) or my_func([0.2, 0.5, 1.2]) will return "Numbers!"
# my_func("hello", 42, option="test") will return "Two values and a kwarg!"
Installation:
pip install dispatchery
See the full README on Github.
MIT license, feedback welcome!
When it comes to function overloading, those who have learned Java should be familiar with it. One of the most common uses is logging, where different overloaded functions are called for different parameters. So, how can we implement function overloading in Python? This post explains how. The Ultimate Guide to Implement Function Overloading in Python
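One standard-library way to get overload-like behaviour is `functools.singledispatch`, which picks an implementation based on the type of the first argument. The sketch below is only an illustration of that approach; the linked guide may use a different technique, and `describe` is a made-up function name.

```python
from functools import singledispatch


@singledispatch
def describe(value) -> str:
    """Fallback used when no more specific overload is registered."""
    return f"object: {value!r}"


@describe.register
def _(value: int) -> str:
    return f"int: {value}"


@describe.register
def _(value: list) -> str:
    return f"list of {len(value)} items"


print(describe(3))
print(describe([1, 2]))
print(describe("x"))
```

Only the first positional argument takes part in the dispatch, so this is closer to single-argument overloading than to Java's full signature matching.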
I recently worked on a project combining my love for terminal limits and video art. Here’s what I achieved:
• Rendered a 1-minute-long (almost two) ASCII video in the terminal, without graphics libraries or external frameworks.
• Used true 24-bit colors for each frame, offering deeper color representation in terminal-based projects.
• Processed 432 million characters over 228 seconds, translating each frame’s pixels to colors.
• Optimized performance with multi-processing, running on an integrated graphics card.
Specs:
• 30 FPS
• 160,000+ characters per frame
• 2,700 frames
• 3 pixels per character for better performance
For further optimization, I reduced the font size to 3 pixels and used background colors to handle brightness.
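The background-color trick can be sketched as follows: each pixel becomes a space character whose background is set with the 24-bit ANSI escape `ESC[48;2;R;G;Bm`. This is an illustrative sketch rather than the project's code; `pixel_to_cell` and `frame_to_ansi` are hypothetical names, and a frame is assumed to be a list of rows of `(r, g, b)` tuples.

```python
RESET = "\x1b[0m"


def pixel_to_cell(r, g, b):
    """One terminal cell: a space drawn on a 24-bit background color."""
    return f"\x1b[48;2;{r};{g};{b}m "


def frame_to_ansi(frame):
    """Convert rows of (r, g, b) pixels into one printable string."""
    lines = []
    for row in frame:
        # Reset at the end of each row so the color does not bleed past it.
        lines.append("".join(pixel_to_cell(*px) for px in row) + RESET)
    return "\n".join(lines)


frame = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
print(frame_to_ansi(frame))
```

Building the whole frame as one string and printing it once tends to be cheaper than printing cell by cell, which matters at 30 FPS.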
What my project does: While not the most practical project, it’s an experiment I’m satisfied with. No real use, but hey, it’s fun!
Target audience: This is more of a fun project, so I can't say it has a specific target audience, but people who strangely feel good coding "useless" things might like it.
Comparison
Well, to be precise, it is not an ASCII player anymore; what it does now is display video in the terminal using basically pure ANSI. I don't think there is an exact alternative, since it doesn't serve a specific purpose except, well, displaying video with text. It is a fun project.
P.S. I’m considering rewriting the frame conversion in C to speed things up. More improvements are coming soon!
That’s it. You can watch a preview with Tank! from Cowboy Bebop (ignore some random color stripes; I had to do some optimization but wasn’t really precise in the difference calculation).
You can find the repo here
Be aware that the current version has not been pushed to GitHub yet, but feel free to analyze the old versions/commits if you like. I will update when I release the current code.
OBS: changefontsize.py only works with Windows Terminal, as it changes the default font in your profile. It degrades compatibility and has been removed in the current version.
Hello, I shared a Python Data Science Bootcamp on YouTube. Bootcamp is over 7 hours and there are 7 courses with 3 projects. Courses are Python, Pandas, Numpy, Matplotlib, Seaborn, Plotly and Scikit-learn. I am leaving the link below, have a great day!
Bootcamp: https://www.youtube.com/watch?v=6gDLcTcePhM
Data Science Courses Playlist: https://youtube.com/playlist?list=PLTsu3dft3CWiow7L7WrCd27ohlra_5PGH&si=6WUpVwXeAKEs4tB6
Any way to bypass this with Python and Chrome?
It's not on the front page, but in the website itself.
The problem is that when I click it manually, it still gives an error.
Welcome to Free Talk Friday on /r/Python! This is the place to discuss the r/Python community (meta discussions), Python news, projects, or anything else Python-related!
Let's keep the conversation going. Happy discussing! 🌟
Hello, I have 10,000 websites to assess for reCAPTCHA implementation and am looking for a more efficient solution. Currently, I'm using Selenium and ThreadPoolExecutor, which depend heavily on my computer's processing power. I can only iterate through 5 or 10 sites simultaneously to run a JavaScript script and determine if reCAPTCHA is present. This method takes approximately 10 hours with just 5 threads in Python. I need a better approach to expedite this process.
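One lighter-weight option, sketched below, is to skip the browser for a first pass: fetch the raw HTML with many I/O-bound threads and look for common reCAPTCHA markers. This is a heuristic under stated assumptions, not a guaranteed detector: the marker list is my own guess at typical embed strings, `has_recaptcha`, `check_site` and `check_sites` are hypothetical names, and pages that inject the widget purely from JavaScript would still need a Selenium pass.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Substrings that commonly appear when reCAPTCHA is embedded (assumed list).
MARKERS = (
    "google.com/recaptcha/",
    "recaptcha.net/recaptcha/",
    "g-recaptcha",
    "grecaptcha.",
)


def has_recaptcha(html):
    """Heuristic: True if any known marker appears in the page source."""
    lowered = html.lower()
    return any(marker in lowered for marker in MARKERS)


def check_site(url, timeout=10):
    """Return (url, True/False), or (url, None) if the fetch failed."""
    try:
        request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urlopen(request, timeout=timeout) as response:
            html = response.read(1_000_000).decode("utf-8", errors="replace")
        return url, has_recaptcha(html)
    except Exception:
        return url, None


def check_sites(urls, workers=50):
    """Fetch many sites concurrently; plain HTTP needs far less CPU than a browser."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(check_site, urls))
```

Sites that come back `False` or `None` could then be re-checked with the existing Selenium script, which should shrink the slow browser workload considerably.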
Hello, I don't know if this already exists, but I believe it would be great to have a library that gives you the same API as pandas but uses Polars under the hood when possible.
I have seen how powerful Polars is, but data scientists still use pandas a lot and it’s difficult to change habits. What do you think?
GitHub: SqueakyCleanText | PyPI: squeakycleantext
Happy to share SqueakyCleanText, a Python library designed to streamline text preprocessing for Natural Language Processing (NLP) and Machine Learning (ML) tasks. Whether you're working on language models, statistical ML pipelines, or any text-heavy application, this library aims to make your preprocessing pipeline more efficient and flexible.
Data Scientists, AI Engineers and Machine Learning Engineers dealing with text data.
NLP Researchers and NLP Linguists looking for customisable preprocessing tools.
Developers building applications that require text cleaning and anonymisation.
Advanced Named Entity Recognition (NER)
Ensemble of Models: Utilises multiple NER models from Hugging Face Transformers for improved accuracy.
Smart Text Chunking: Efficiently handles long texts by splitting them into optimized chunks.
Configurable Confidence Thresholds: Adjust the sensitivity of entity detection.
Configurable Models: Choose the NER models that suit your use case.
Configurable Positional Tags: Choose what you would like to be removed from the texts.
Automatic Language Detection: Supports English, German, Spanish, and Dutch with automatic model selection.
Modular Pipeline Architecture
Toggle-able Features: Easily enable or disable any step in the pipeline.
Single and Batch Processing: Consistent configuration applies to both modes.
Default Pipeline Includes:
Bad Unicode correction
HTML and URL handling
Contact information anonymization (emails, phone numbers)
Date and number normalization
Advanced NER processing
Whitespace and punctuation normalization
Performance Optimizations
Under-the-Hood NER Improvements: Enhanced NER processing delivers faster results without compromising accuracy.
Batch Processing Support: Process large datasets efficiently with configurable batch sizes.
Memory Management: Automatic cleanup of GPU memory to handle large-scale processing.
Comprehensive and Modular: Unlike libraries that focus on specific tasks, SqueakyCleanText offers a full suite of preprocessing steps that you can customize to your needs.
Advanced NER Integration: Combines multiple NER models and uses smart chunking to improve entity recognition in long texts.
Dual Output Formats: Provides both language model-formatted text and statistical model-formatted text in a single pass.
Easy Integration: Designed to seamlessly fit into existing workflows with minimal adjustments.
Installation
pip install SqueakyCleanText
Customizable Pipeline: Tailor the preprocessing steps to match your project's requirements by toggling features in config.py.
Seamless NER Integration: Use the advanced NER processing to anonymize sensitive data or extract entities for downstream tasks.
Flexible Processing: Apply the same configurations to both single and batch processing modes without changing your code.
Efficient for Large Datasets: Leverage batch processing and memory optimizations to handle large volumes of text data.
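The idea of a toggle-able pipeline that behaves the same in single and batch mode can be sketched in plain Python. This is not SqueakyCleanText's API: the step names, the `clean` and `clean_batch` functions, and the simple regular expressions are all assumptions for illustration, and the library's real options live in its own config.py.

```python
import re

# Hypothetical steps, applied in this order when enabled.
STEPS = {
    "replace_urls": lambda text: re.sub(r"https?://\S+", "<URL>", text),
    "replace_emails": lambda text: re.sub(
        r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", "<EMAIL>", text
    ),
    "normalize_whitespace": lambda text: re.sub(r"\s+", " ", text).strip(),
}


def clean(text, enabled=None):
    """Run every enabled step in order; all steps run when enabled is None."""
    enabled = STEPS.keys() if enabled is None else enabled
    for name, step in STEPS.items():
        if name in enabled:
            text = step(text)
    return text


def clean_batch(texts, enabled=None):
    """Batch mode reuses the exact same configuration as single mode."""
    return [clean(text, enabled) for text in texts]


print(clean("Mail  me at a.b@example.com or see https://example.com/x  now"))
```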
A manager from a sister team came to me and asked me to produce the most obscure Python code I could come up with, because she wanted to give her developers a challenge. The requirement was that it should produce a code that could be sent in a text message to get the next challenge. And no, you are not allowed to run it :) They solved it in 30 minutes. Can you solve it?
import inspect

def code_as_it_was_meant_to_be(tmp):
    """
    www.lexico.com/definition/code
    "A system of words, letters, figures, or symbols used to represent others,
    especially for the purposes of secrecy."
    Send what is printed out by running this function in a text message to xxx
    """
    if len(set(tmp)) * 2 > len(tmp):
        tmp = eval(inspect.stack()[1][4][0].replace(tmp, tmp + tmp[::-1]))
        print(
            "".join(
                str(chr((ord(tmp[i * 2]) + ord(tmp[-(i + 1) * 2])) // 2))
                for i in range(len(tmp) // 4)
            )
        )
    else:
        return tmp[::-1]

code_as_it_was_meant_to_be("d,W3b6`@")
What My Project Does:
This project automates the process of showcasing detailed analytics and visual insights of your Python repositories on your GitHub profile using GitHub Actions. Once set up, it gathers and updates key statistics on every push, appending the latest information to the bottom of your README without disrupting existing content. The visualizations are compiled into a gif, ensuring that your profile remains clean and visually engaging.
With this tool, you can automatically analyze, generate, and display visuals for the following metrics:
- Repository breakdown by commits and lines of Python code
- Heatmap of commit activity by day and time
- Word cloud of commit messages
- File type distribution across repositories
- Libraries used in each repository
- Construct counts (including loops, classes, control flow statements, async functions, etc.)
- Highlights of the most recent closed PRs and commits
By implementing these automated insights, your profile stays up-to-date with real-time data, giving visitors a dynamic view of your work without any manual effort.
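As an illustration of how construct counts like these can be gathered, the sketch below walks a Python syntax tree with the standard `ast` module. It is not code from the project; `count_constructs`, the `TRACKED` labels and the sample source are assumptions made for the example.

```python
import ast
from collections import Counter

# Node types to count and the labels to report them under (assumed selection).
TRACKED = {
    ast.For: "for loops",
    ast.While: "while loops",
    ast.If: "if statements",
    ast.ClassDef: "classes",
    ast.FunctionDef: "functions",
    ast.AsyncFunctionDef: "async functions",
}


def count_constructs(source):
    """Count tracked constructs anywhere in the given source text."""
    counts = Counter()
    for node in ast.walk(ast.parse(source)):
        label = TRACKED.get(type(node))
        if label:
            counts[label] += 1
    return counts


SAMPLE = '''
class Greeter:
    def greet(self, names):
        for name in names:
            if name:
                print(name)

async def main():
    while False:
        pass
'''

print(count_constructs(SAMPLE))
```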
---
Target Audience:
This tool is designed for Python developers and GitHub users who want to showcase their project activity, code structure, and commit history visually on their profile. It’s ideal for those who value continuous profile enhancement with minimal maintenance, making it useful for developers focused on building a robust GitHub presence or professionals looking to highlight their coding activity to potential collaborators or employers.
---
Comparison:
I haven't seen other tools like this, but by using GitHub Actions, this project ensures that new data is gathered and appended automatically, including in-depth insights such as commit activity heatmaps, word clouds, and code construct counts. This makes it more comprehensive and effortless to maintain than alternatives that require additional steps or only offer limited metrics.
Repo:
https://github.com/sockheadrps/PyProfileDataGen
Example:
https://github.com/sockheadrps
Youtube Tutorial:
Hello, fellow Python enthusiasts!
I am interested in exploring Python projects that can search for and identify the best flight options within a specified date range, such as a particular month like April 2024 or a broader range. This type of feature was once handled efficiently by services like Skyscanner, and I would love to find Python tools or open-source projects capable of similar functionality today.
If you know of any relevant resources, projects, or libraries, I’d greatly appreciate your suggestions!
Many thanks in advance for your input and help!
Welcome to this week's discussion on Python in the professional world! This is your spot to talk about job hunting, career growth, and educational resources in Python. Please note, this thread is not for recruitment.
Let's help each other grow in our careers and education. Happy discussing! 🌟
https://www.techspot.com/news/105557-pypim-new-method-execute-python-code-directly-ram.html
Performance can be significantly improved when the CPU is not involved
This is for a mathematics project that is due next Monday.
I am an undergraduate student in India majoring in mathematics. My professor asked me to present a mathematical solution in the form of either a project or a paper.
Now, I know I am not going to end up with a paper, and I don't even have the time left for that.
The project was due next month but, you see, now I need to do it all in a weekend.
My core interests are in data science and AI, but I am quite open to projects in business simulation, optimization and finance (professor's core subjects).
Project Ideas that I had ChatGPTed or figured out myself:
Performing a Network Analysis on Delhi Metro and finding the shortest routes using networkx (This is the one I was currently doing)
Deploying Trade strategies using Stochastic calculus and employing trade indicators on historical data (AKA technical analysis) (Abandoned project from last semester)
Creating a CLI-based Computer Algebra System/mathematics language that takes commands and returns outputs:
```
calculus integrate y:=sin(x) with respect to x
plot y^2 == 4x
```
I know the third one is silly because many advanced tools exist and this will never be able to reach that level of complexity.
I need help from you all in choosing a project idea ...
Any other project idea is also welcomed (primarily from mathematics, data science, machine learning and Finance)
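For the Delhi Metro idea, the core computation is a weighted shortest-path search. The sketch below does it with Dijkstra's algorithm and only the standard library, as a stand-in for what networkx's `shortest_path` would provide. The station names and travel times are made up for illustration and are not real metro data; `shortest_route` is a hypothetical name.

```python
import heapq

# Made-up stations and travel times in minutes, for illustration only.
GRAPH = {
    "A": {"B": 4, "C": 2},
    "B": {"A": 4, "C": 1, "D": 5},
    "C": {"A": 2, "B": 1, "D": 8},
    "D": {"B": 5, "C": 8},
}


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_minutes, path) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == goal:
            return cost, path
        if station in visited:
            continue
        visited.add(station)
        for neighbour, minutes in graph[station].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + minutes, neighbour, path + [neighbour])
                )
    return None


print(shortest_route(GRAPH, "A", "D"))
```

With real data, the same structure could carry interchange penalties as extra edge weights, which is where the mathematical modelling becomes interesting for a project.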
Over the past week, I have been developing an assembly-like interpreter for my custom language, which I call AXM. AXM is intended to resemble assembly language, but with a slightly more accessible syntax. Although the interpreter is currently written in Python and still in its early stages, it serves as a "toy" interpreter to test out language design concepts.
This project is primarily a toy rather than a production-ready tool. It’s not designed for practical applications but rather for exploration and learning. The syntax is heavily inspired by assembly languages but is simplified to make it a bit easier to work with. Anyone interested in language development or assembly-like languages might find it interesting to explore.
AXM is distinct from existing assembly languages because it focuses more on accessibility and is designed to be relatively simple, rather than optimized for performance or real-world use. Unlike traditional assembly, AXM is an interpreted language, allowing users to run code directly without needing to compile it. While there are other interpreters for assembly-inspired languages, AXM aims to balance simplicity with the principles of low-level programming, making it somewhat unique.
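To show what interpreting an assembly-like language directly can look like, here is a toy sketch in Python. It does not use AXM's actual syntax or code: the `mov`, `add` and `print` instructions, the `;` comment marker and the `run` function are all invented for this example.

```python
def run(program):
    """Execute a tiny made-up instruction set and return the printed values."""
    registers = {}
    output = []

    def value(token):
        # An operand is either a register name or an integer literal.
        return registers[token] if token in registers else int(token)

    for line in program.strip().splitlines():
        line = line.split(";")[0].strip()  # drop comments and whitespace
        if not line:
            continue
        op, *args = line.replace(",", " ").split()
        if op == "mov":
            registers[args[0]] = value(args[1])
        elif op == "add":
            registers[args[0]] += value(args[1])
        elif op == "print":
            output.append(value(args[0]))
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


PROGRAM = """
mov a, 2      ; a = 2
mov b, 40     ; b = 40
add a, b      ; a = a + b
print a
"""

print(run(PROGRAM))
```

Each line is read and executed immediately, with no separate compile step, which is the interpreted behaviour the post describes.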
Any feedback is greatly appreciated! I’d love to hear thoughts on its potential and any suggestions for improvements.
https://github.com/KuriWasTaken/AXM
Edit: I know the code is very badly formatted and I should add more comments. I will fix this.
https://flask.palletsprojects.com/en/stable/changes/#version-3-1-0
Hello r/Python!
Thought I'd share extractous, a new document extraction library that processes documents up to 20x faster than existing solutions.
What The Project Does
Extractous is a high-performance document extraction library that processes PDFs, Word documents, HTML, and many other formats with native speed. It's built with a Rust core and uses GraalVM to compile Tika components to native code, eliminating the need for external services or JVM runtime.
Performance
Extracted Apple's 10-K filing in 320ms vs unstructured-io's 8.2s
Average 18x faster across SEC filings dataset
Significantly lower memory footprint
Quick Start
pip install extractous
from extractous import Extractor
extractor = Extractor()
result = extractor.extract_file_to_string("document.pdf")
print(result)
Target Audience
Comparison
Features
Coming Soon
XHTML output support
Enhanced file metadata extraction
GIL-bypassing batch processing API for parallel workloads
Repo
https://github.com/yobix-ai/extractous
Try it online (free)
https://www.extractous.com/
Which is better, Visual Studio Code or PyCharm?
In terms of tools and ease of use, I currently use PyCharm, but I find it difficult to organise files.
uv is rapidly maturing as an open-source tool for Python project management, reaching full-featured capability with recent versions 0.4.27 and 0.5.0 and making it a strong alternative to Poetry, pyenv, and pipx. However, concerns exist over its long-term stability and licensing, given that Astral is venture-funded.
https://open.substack.com/pub/martynassubonis/p/python-project-management-primer-a55