/r/CodingHelp

Welcome! Feel free to ask any questions regarding coding you have!

Our Rules

1. FLAIR YOUR POSTS! Don't put tags in post titles!

2. Do not ask us to do all the coding for you unless you have money to spend. (If you do have money to spend, make that clear and state the amount.)

3. Do not post spam and/or misleading titles.

4. Do not be abusive to other coders.

5. Please format code properly, or use a site such as Gist or Pastebin. If possible please provide a live example of your issue.

6. Do not downvote people because you think they asked a dumb question. Just because you think that someone has a dumb question, doesn't mean that it is dumb to them.

7. Do not have a misleading user flair. Keep them sensible, describing your level of coding ability and/or languages you know and/or your profession.

8. Please do not ask unethical questions, such as asking for homework to be written by someone else, or asking someone to copy another project directly.

9. Make sure to follow the Reddit Rules.


How to start coding:

Check our website, https://codinghelp.site. We have all the information you need there!


Suggest a post flair

If you have any suggestions for flairs (programming languages or generic coding topics) that we should add, please use the button below to message the mods with your suggestion.

If approved as a sensible flair for the community to use, it will be added to our bot for automated suggestions and to the flair list for everyone to use!

Anyone who abuses this by spamming mods will be banned.


Current supported flairs

  • HTML
  • CSS
  • JavaScript
  • PHP
  • SQL
  • Ruby
  • Java
  • Python
  • C++
  • C#
  • C (Not in Bot)
  • Open Source
  • Other Code
  • Random
  • Meta

Flair colors

  • Green: Web Related Languages (e.g. HTML, CSS)
  • Blue: App Related Languages (e.g. Python, C#)
  • Red: Generic Coding Topics (e.g. Open Source)
  • Yellow: Other Flairs (e.g. Random, Meta)


79,936 Subscribers

1

RESEARCH

For someone like a senior programmer, professor, or anyone with expertise: what type of measurement tool should I use to determine if someone is a novice or veteran programmer?

1 Comment
2024/11/01
04:58 UTC

1

Frustrating macro scope issue

So I have a macro, changing the names to be shorter cause I’m on my phone:

#define MACRO(STR, ...) { printf("macro - "); printf((STR), ##__VA_ARGS__); }

Straightforward: it just prints a formatted string with a variable number of args, with the text "macro - " before it. The problem is that I'm trying to extract the name to make it more generic:

#define MACBASE(NAME, STR, ...) { printf("%s - ", (NAME)); printf((STR), ##__VA_ARGS__); }

#ifdef MACRO_PRINT
#define MACRO(STR, ...) MACBASE("macro", (STR), ##__VA_ARGS__)
#endif

#undef MACBASE

For some reason, in the definition of MACRO, MACBASE is “out of scope”. I’ve been banging my head against this problem for hours and I have no idea what’s going on. Has anyone done macro stuff like this before?

0 Comments
2024/11/01
00:47 UTC

0

Python beginner

What is the best way to learn by self-teaching? Any recommendations would be appreciated.

3 Comments
2024/11/01
00:34 UTC

1

Dockerfile to Run Chrome with Selenium?

I've tried a plethora of solutions from Stack Overflow, YouTube tutorials, and ChatGPT debugging. However, none of these solutions have worked. When I deploy my script with this Dockerfile, this is the error I get in return. Any help would be appreciated!

Dockerfile:

FROM python:3.10-slim

ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update && apt-get install -y chromium-driver chromium fonts-liberation \
    libnss3 \
    libx11-6 \
    libatk-bridge2.0-0 \
    libatspi2.0-0 \
    libgtk-3-0 \
    libxcomposite1 \
    libxcursor1 \
    libxdamage1 \
    libxrandr2 \
    libgbm-dev \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

ENV CHROME_BIN=/usr/bin/chromium \
    CHROME_DRIVER_BIN=/usr/bin/chromedriver

WORKDIR /app

COPY requirements.txt .

RUN pip install --no-cache-dir -r requirements.txt

COPY . .

EXPOSE 8080

Error:

  File "/workspace/app/scraper.py", line 21, in <module>
    driver = webdriver.Chrome(options=chrome_options)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/chrome/webdriver.py", line 45, in __init__
    super().__init__(
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/chromium/webdriver.py", line 55, in __init__
    self.service.start()
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/common/service.py", line 105, in start
    self.assert_process_still_running()
  File "/workspace/.heroku/python/lib/python3.12/site-packages/selenium/webdriver/common/service.py", line 118, in assert_process_still_running
    raise WebDriverException(f"Service {self._path} unexpectedly exited. Status code was: {return_code}")
selenium.common.exceptions.WebDriverException: Message: Service /workspace/.cache/selenium/chromedriver/linux64/130.0.6723.69/chromedriver unexpectedly exited. Status code was: 127

0 Comments
2024/10/31
21:53 UTC

1

Help choosing a cloud provider solution for personal projects

Hi there, I have been working on many personal projects lately and actually rekindling my love for creating things and programming in general after a couple of blank years.

I'm looking to deploy my applications to the cloud so I can actually have them available and mostly learn and tinker with them. The one I'm looking to deploy at the moment is composed of:

- A .NET API and SQL Server Express for backend shenanigans.
- A Vue3 application for the front.

But the idea is to actually have a place where I can more or less easily deploy my apps and test them, so I was leaning towards a VM where I could bundle everything together, but since I've never done the whole process for myself I'm getting lost at some points. For starters I'm looking for a VM where:

- Price is flat (X$/month or year), with no autoscale or surprise bills going on. I've tried the free trials of most of the big providers (Azure, AWS, GCloud...), and now I want something that is not a free tier, but it should be very cheap as this will basically be a sandbox for my learning.

- Preferably Windows as I will be running .NET stuff, but this is something minor.

I've been running these ideas past GPT and such, but I would very much like to hear experiences from people who were in my position and moved up, with some insights on actual pricing and use of services that would also fit my use case.

I was also wondering, if I run everything on one VM (which I know is not ideal, but as far as I can see it's my easiest and cheapest option), how to set up:

- The whole IP routing to host the app on the VM (static IP)
- CI/CD

I would appreciate any help, from any level, I'm just looking for any insight or knowledge that could be shared. Thanks a lot to anyone who took the time to read this.

1 Comment
2024/10/31
21:17 UTC

2

How does coding work?

Like how does code allow you to code anything you want? People use it to program games, OS, AI, etc..

and the whole theory of coding languages coming from the lambda calculus, is that using it, it's possible to come up with literally ANYTHING a computer could EVER do, every game, every voice assistant, every plot to take over the world, every single server for every single website. howww??? Like how does that emerge from a limited set of inputs in the first place?

like if you wanted to program a ball to roll down a hill, you can't say ball=roll.... There is no ball or roll in python, or java or anything, so what is there?

This is a pretty fundamental issue I've had with coding for a long time that's prevented me from getting into it.

How do you bring your imagination to life with a limited number of inputs? How do you simulate anything with text?
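One way to see how "a ball rolling down a hill" turns into code: the language has no ball and no roll, but it does have numbers and arithmetic. The ball becomes a couple of numbers (position and speed), and "rolling" becomes a rule that updates those numbers a small step at a time. The sketch below is only an illustration under simplifying assumptions (a frictionless slope, a ball that slides rather than spins); the names `simulate_ball` and `slope_degrees` are made up for this example.

```python
import math

GRAVITY = 9.81  # metres per second squared


def simulate_ball(slope_degrees, duration, dt=0.01):
    """Return (distance, speed) of a ball after `duration` seconds on a slope.

    Assumes a frictionless slope and a ball that slides rather than spins;
    this is an illustration, not a physics engine.
    """
    acceleration = GRAVITY * math.sin(math.radians(slope_degrees))
    position = 0.0  # distance travelled along the slope, in metres
    speed = 0.0     # metres per second
    steps = round(duration / dt)
    for _ in range(steps):
        # "Rolling" is just this update rule applied over and over.
        speed += acceleration * dt
        position += speed * dt
    return position, speed


if __name__ == "__main__":
    distance, final_speed = simulate_ball(slope_degrees=30, duration=2.0)
    print(f"After 2 s the ball has moved {distance:.2f} m at {final_speed:.2f} m/s")
```

A game does the same thing with more numbers (x, y, rotation, colour) and draws a picture from them many times per second.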

10 Comments
2024/10/31
21:11 UTC

1

I have a .bat file on my old laptop that I'm not able to access

So it says I don't have permission to access the item. The .bat file is hidden and I can't even unhide it.

I've tried changing the owner, renaming the file to .txt, icacls, the takeown command in CMD, and Get-Item in PowerShell.

Still I'm getting nowhere.

Please help!

1 Comment
2024/10/31
20:41 UTC

1

What led you to a certain path in programming? Was it maths? Engineering etc?

I'm just wondering what got you guys to feel this is your "calling"? Was it by chance, you just have an interest in such things, or was it video games etc?

I'd love to hear your opinions, thank you

5 Comments
2024/10/31
16:19 UTC

1

How can I store the embeddings in my ChromaDB?

I am trying to save the calculated embeddings into ChromaDB:

import os
import cohere
import time

from pypdf import PdfReader
from dotenv import load_dotenv
import chromadb

load_dotenv()

docsFolder='./docs'

def getTextFromPDF(fileName):
    text = ""
    reader = PdfReader(fileName)
    for page in reader.pages:
        text += page.extract_text() + "\n"
    return text

def getPhrases(docsFolder):
    phrases=[]

    with os.scandir(docsFolder) as it:
        for entry in it:
            if not entry.name.startswith('.') and entry.is_file():
                text=getTextFromPDF(docsFolder+"/"+entry.name)
                passages = [p.strip() for p in text.split("\n\n") if p.strip()]
                phrases.extend(passages)

    return phrases

start = time.perf_counter()
phrases = getPhrases(docsFolder)
end = time.perf_counter()

print("Passage Extraction time "+str(end-start)+" seconds")

co = cohere.ClientV2(api_key=os.getenv("COHERE_KEY"))

start = time.perf_counter()
res = co.embed(texts=phrases,model="embed-multilingual-v3.0", input_type="search_document",embedding_types=['float'])
end = time.perf_counter()

print("Embeddings generation time: "+str(end-start)+" seconds")

print(len(res.texts),len(res.embeddings.float),len(phrases))

# https://stackoverflow.com/a/79145093/4706711

client = chromadb.Client()
collection_name = "client_name_collection"
collection = client.create_collection(name=collection_name)


for i,phrase in enumerate(phrases):
   
    collection.add({
        # https://stackoverflow.com/a/79145093/4706711
        'embedding':res.embeddings.float[i],
        'matadatas':[{"phrase":phrase}]
    })

In order to run the script above, I first started my ChromaDB locally:

chroma run

Then I run the script but I got:

Traceback (most recent call last):
  File "/mnt/job/Kwdikas/TechMate/chatbot/store_data.py", line 57, in <module>
    collection.add({
  File "/mnt/job/Kwdikas/TechMate/chatbot/venv/lib/python3.10/site-packages/chromadb/api/models/Collection.py", line 86, in add
    ) = self._validate_and_prepare_embedding_set(
  File "/mnt/job/Kwdikas/TechMate/chatbot/venv/lib/python3.10/site-packages/chromadb/api/models/CollectionCommon.py", line 272, in _validate_and_prepare_embedding_set
    ) = self._validate_embedding_set(
  File "/mnt/job/Kwdikas/TechMate/chatbot/venv/lib/python3.10/site-packages/chromadb/api/models/CollectionCommon.py", line 174, in _validate_embedding_set
    valid_ids = validate_ids(maybe_cast_one_to_many_ids(ids))
  File "/mnt/job/Kwdikas/TechMate/chatbot/venv/lib/python3.10/site-packages/chromadb/api/types.py", line 271, in validate_ids
    raise ValueError(f"Expected IDs to be a list, got {type(ids).__name__} as IDs")
ValueError: Expected IDs to be a list, got dict as IDs

So in my case, what should I use as an ID? Each item of res.embeddings.float represents the embedding of the corresponding item in phrases.

So how can I store my calculated embeddings in ChromaDB? Can I save them all at once?
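A possible shape for the fix, as a sketch rather than a verified solution: ChromaDB's `collection.add` takes keyword arguments (`ids`, `embeddings`, `metadatas`, `documents`), each a list with one entry per item, so everything can go in with one call. The traceback is consistent with the dict being received as the first positional argument, `ids`. The helper name `build_add_arguments` and the `phrase-N` ID scheme are assumptions made for this example; any unique string per item should work as an ID.

```python
def build_add_arguments(phrases, embeddings, id_prefix="phrase"):
    """Build keyword arguments for a single batched collection.add(...) call.

    The ID scheme ("phrase-0", "phrase-1", ...) is an assumption for this
    sketch; any unique string per item would do.
    """
    if len(phrases) != len(embeddings):
        raise ValueError("phrases and embeddings must have the same length")
    return {
        "ids": [f"{id_prefix}-{i}" for i in range(len(phrases))],
        "embeddings": [list(vector) for vector in embeddings],
        "documents": list(phrases),
        "metadatas": [{"phrase": phrase} for phrase in phrases],
    }


if __name__ == "__main__":
    # Toy data standing in for `phrases` and `res.embeddings.float`.
    demo_phrases = ["first passage", "second passage"]
    demo_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
    arguments = build_add_arguments(demo_phrases, demo_embeddings)
    print(arguments["ids"])
    # With a real collection this would be one call for all items:
    # collection.add(**arguments)
```

Separately, `chromadb.Client()` creates an in-process client; to talk to the server started with `chroma run`, `chromadb.HttpClient(host="localhost", port=8000)` is probably what is wanted, assuming the default port.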

0 Comments
2024/10/31
15:25 UTC

1

SAS HELP

Hi everyone, I have one question about the following dataset, that has bookID, title, author, pubyear (publishing year), and isbn.

So I basically need to verify whether the ISBN numbers were issued chronologically, using PROC SQL in SAS. That is, is it possible for a book to have a lower ISBN than a previously published book? If anyone has any idea how to do this, I’d really appreciate it! Thanks!
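One way to frame the check is a self-join that looks for pairs where the earlier-published book has the higher ISBN; if the query returns no rows, the ISBNs are in chronological order. The sketch below uses Python's built-in SQLite instead of SAS, purely to show the query shape. The table name `books`, the sample rows, and the assumption that `isbn` is stored as a plain integer are all made up for this example. The join is ordinary SQL, so it may carry over to PROC SQL with little change, but that is untested here.

```python
import sqlite3

# Pairs where the earlier-published book has the higher ISBN.
OUT_OF_ORDER_QUERY = """
    SELECT a.bookID, b.bookID
    FROM books AS a
    JOIN books AS b
      ON a.pubyear < b.pubyear
     AND a.isbn > b.isbn
    ORDER BY a.bookID, b.bookID
"""


def find_out_of_order_pairs(rows):
    """Return (earlier_id, later_id) pairs whose ISBNs break chronological order.

    `rows` are (bookID, title, author, pubyear, isbn) tuples; isbn is assumed
    to be stored as a plain integer so that the comparison is numeric.
    """
    connection = sqlite3.connect(":memory:")
    try:
        connection.execute(
            "CREATE TABLE books ("
            "bookID INTEGER, title TEXT, author TEXT, pubyear INTEGER, isbn INTEGER)"
        )
        connection.executemany("INSERT INTO books VALUES (?, ?, ?, ?, ?)", rows)
        return connection.execute(OUT_OF_ORDER_QUERY).fetchall()
    finally:
        connection.close()


if __name__ == "__main__":
    # Made-up sample data: book 2 was published later but has a lower ISBN.
    sample = [
        (1, "First", "A. Author", 1990, 100),
        (2, "Second", "B. Author", 1995, 50),
        (3, "Third", "C. Author", 2000, 200),
    ]
    print(find_out_of_order_pairs(sample))
```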

0 Comments
2024/10/31
14:41 UTC

1

How can I map my texts with the split passages?

I made this script:


import os
import cohere
import time

from pypdf import PdfReader
from dotenv import load_dotenv
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

load_dotenv()

docsFolder='./docs'

def getTextFromPDF(fileName):
    text = ""
    reader = PdfReader(fileName)
    for page in reader.pages:
        text += page.extract_text() + "\n"
    return text

def getPhrases(docsFolder):
    phrases=[]

    with os.scandir(docsFolder) as it:
        for entry in it:
            if not entry.name.startswith('.') and entry.is_file():
                text=getTextFromPDF(docsFolder+"/"+entry.name)
                passages = [p.strip() for p in text.split("\n\n") if p.strip()]
                phrases.extend(passages)

    return phrases

start = time.perf_counter()
phrases = getPhrases(docsFolder)
end = time.perf_counter()

print("Passage Extraction time "+str(end-start)+" seconds")

co = cohere.ClientV2(api_key=os.getenv("COHERE_KEY"))

start = time.perf_counter()
res = co.embed(texts=phrases,model="embed-multilingual-v3.0", input_type="search_document",embedding_types=['float'])
end = time.perf_counter()

print("Embeddings generation time: "+str(end-start)+" seconds")

print(len(res.texts),len(res.embeddings.float),len(phrases))

What it does is extract text from PDFs and calculate an embedding for each passage. What I want is to map each input text to its embedding returned from the Cohere API and save them into the database.

The final goal is to store them in a DB, but I do not know whether the embedding in res.embeddings.float[0] is the embedding of phrases[0]. Furthermore, I want to store the embeddings so I can use them to search for data matching a given search query.

The way I will do it is to calculate the embedding of the search query/search term and sort the incoming results by minimum distance. Therefore I am looking for a way to do this.

Have you used this API, and how did you store the embeddings for each text? Calling it one text at a time seems impractical to me.
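A minimal sketch of the pairing and search idea, assuming the embeddings come back in the same order as the input texts (the equal lengths printed by the script are consistent with that but do not prove it). It uses plain Python lists and cosine similarity instead of a vector database; the function names and the toy two-dimensional vectors are invented for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pair_texts_with_embeddings(texts, embeddings):
    """Pair texts[i] with embeddings[i]; relies on the order being preserved."""
    if len(texts) != len(embeddings):
        raise ValueError("texts and embeddings must have the same length")
    return list(zip(texts, embeddings))


def search(query_embedding, records, top_k=3):
    """Return up to top_k (score, text) pairs, most similar first."""
    scored = [
        (cosine_similarity(query_embedding, embedding), text)
        for text, embedding in records
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy vectors standing in for `phrases` and `res.embeddings.float`.
    records = pair_texts_with_embeddings(
        ["a passage about cats", "a passage about dogs"],
        [[1.0, 0.0], [0.0, 1.0]],
    )
    for score, text in search([0.9, 0.1], records, top_k=1):
        print(f"{score:.3f}  {text}")
```

For the query itself, the embed call would presumably use `input_type="search_query"` rather than `"search_document"`. A vector database such as Qdrant performs the same ranking at scale.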

1 Comment
2024/10/31
13:05 UTC

4

Need help on...how to start learning coding

Can anyone help ... where to start learning coding .... like which language should we start studying?

16 Comments
2024/10/31
07:03 UTC

1

Need help with animation in css

.preamble{
    z-index: 1;
    display: inline-flex;
    flex-flow: column wrap;
    position: absolute;
    width: 45%;
    right:1%;
    bottom: -100%;
    opacity: 1;
    /* animation: rising_preamble, leaving_preamble;
    animation-fill-mode: forwards, forwards;
    animation-duration: 120s, 130s; 
    animation-iteration-count:  infinite, infinite; 
    animation-delay: 1s, 20s; */
    animation: rising_preamble 120s ease-in 0s infinite;
    animation: leaving_preamble 130s ease 10s infinite;
}



@keyframes rising_preamble{
    0% {
        bottom: -100%; opacity: 0;
      }

    4% {
        bottom: 10%;
        opacity: 1;
      }
    100% {
        bottom: 10%;
        opacity: 1;
      }
}

@keyframes leaving_preamble{
    0% {
        bottom: 10%;
        opacity: 1;
      }
    7.692% {
        bottom: 100%;
        opacity: 0;
    }
    40%{bottom: -100%; opacity: 0;}
    50%{bottom: -100%; opacity: 1;}

    100% {bottom: -100%; opacity: 1;}
}

Why doesn't this work?

1 Comment
2024/10/31
03:35 UTC

0

iNeedHelp

I know JavaScript decently well, I learned with Sololearn. I don’t have a laptop, only an iPad with a keyboard. Are there any good apps I can practice coding on?

1 Comment
2024/10/31
02:29 UTC

2

Particle filter assistance needed

I am currently trying to implement a particle filter in the ROS, RViz and Gazebo environment. I have implemented a motion model and a sensor model to try to complete the assignment and am a little bit stuck. When I run the code, I am able to load and calculate some particles and they show up in the RViz environment. I know the motion model is working because as the robot moves forward 3 units all of the particles move forward 3 units, but all in the direction they were randomly started in. I am having trouble making the particles change direction to try to locate the robot, leading me to believe the sensor model is not working. Below is a link to most of my files for the project. The main one I am coding the particle filter in is the particle-filter-2.py file. If you don't mind taking a look at my code to help me fix this problem that would be amazing! Thanks in advance!

https://github.com/BennettSpitz51/particle-filter.git
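For reference, a minimal one-dimensional sketch of the three steps a particle filter usually cycles through: move, weight, resample. It is not taken from the linked repository. The single landmark, the noise values, and the function names are assumptions chosen to keep the example small; in the ROS assignment each particle would carry a full pose and the sensor model would compare laser readings against the map. The point it illustrates is that particles only converge on the robot if the sensor-model weights are actually used in a resampling step.

```python
import math
import random


def gaussian_likelihood(expected, measured, sigma):
    """Unnormalised likelihood of `measured` given `expected` and noise sigma."""
    error = measured - expected
    return math.exp(-(error * error) / (2.0 * sigma * sigma))


def particle_filter_step(particles, movement, measurement, landmark,
                         motion_noise=0.1, sensor_noise=0.5, rng=random):
    """Run one move / weight / resample cycle on a list of 1D positions."""
    # 1. Motion model: every particle moves like the robot, plus some noise.
    moved = [p + movement + rng.gauss(0.0, motion_noise) for p in particles]
    # 2. Sensor model: weight each particle by how well the distance it would
    #    measure to the landmark matches the real measurement.
    weights = [
        gaussian_likelihood(landmark - p, measurement, sensor_noise)
        for p in moved
    ]
    if sum(weights) == 0.0:
        weights = [1.0] * len(moved)  # fall back to uniform weights
    # 3. Resample: likely particles are copied, unlikely ones die out.
    return rng.choices(moved, weights=weights, k=len(moved))


if __name__ == "__main__":
    rng = random.Random(0)
    landmark = 10.0
    true_position = 2.0
    particles = [rng.uniform(0.0, 10.0) for _ in range(500)]
    for _ in range(10):
        true_position += 0.5
        measurement = landmark - true_position  # noise-free for simplicity
        particles = particle_filter_step(
            particles, 0.5, measurement, landmark, rng=rng
        )
    estimate = sum(particles) / len(particles)
    print(f"true position {true_position:.2f}, estimate {estimate:.2f}")
```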

1 Comment
2024/10/30
23:26 UTC

0

Need ideas

I've been programming for 8 months now and as I progress I'm starting to run out of ideas on what to program. I did everything from Calculator to To Do app, Weather app etc... I want to start my own project but every time I come up with something, there is already a better version of it. Are there any ideas that you guys have for me to program or collaborate on? I would really appreciate the advice.

4 Comments
2024/10/30
19:08 UTC

0

Struggling to make an executable out of a Python app

Hi,

I'm working on a Python application that utilizes mainly Streamlit and pandas. It's a decently straightforward application that sends web requests for data, manipulates the output and shows the result to the user. At this point the app works fine, but I need to turn it into an executable so that it can be run on other devices without setting up the environment.

My first thought was PyInstaller, however it doesn't seem to work well with Streamlit. I've read through this thread on the topic and followed the tutorial on this GitHub repo but I haven't managed to get it working - the resulting .exe crashes with the warning:

"importlib.metadata.PackageNotFoundError: No package metadata was found for streamlit"

even though I've re-checked the hook-streamlit.py and also tried the fix from this post. At this point I'm guessing it just doesn't work on the versions of python/other libraries I'm using, but I fear that switching those versions around would be too much of a timesink with no guarantee of a fix.

I've also looked at cx_freeze, however the tutorials I've seen showcase a toy problem with one .py file. My application uses several modules that import from one another as needed, which I'm not sure is an issue. In any case I got stuck right at the start - the generated .exe file gives strange errors and crashes when executing the first file (main.py) and reading the first few import lines.

Does anyone here have experience turning a Streamlit app into an executable? It doesn't have to be through the previously mentioned tools, I'm down to try anything at this point.

Thank you for reading!
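One thing that may be worth trying for the `PackageNotFoundError`: PyInstaller has a `--copy-metadata` option that bundles a package's metadata, and a `--collect-all` option that bundles its data files and submodules. The sketch below only assembles such a command line as a list; whether it fixes this particular crash is untested. `run_app.py` is a hypothetical wrapper script that would launch the Streamlit app, and the helper name is invented for this example.

```python
import shlex


def build_pyinstaller_command(entry_script, package="streamlit", one_file=True):
    """Assemble a PyInstaller command line that also bundles package metadata.

    `entry_script` is a hypothetical wrapper script; the command is returned
    as a list so it can be passed to subprocess.run without shell quoting.
    """
    command = ["pyinstaller"]
    if one_file:
        command.append("--onefile")
    # Bundle the metadata that importlib.metadata looks up at runtime.
    command += ["--copy-metadata", package]
    # Bundle the package's data files and submodules as well.
    command += ["--collect-all", package]
    command.append(entry_script)
    return command


if __name__ == "__main__":
    print(shlex.join(build_pyinstaller_command("run_app.py")))
```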

1 Comment
2024/10/30
19:06 UTC

0

Do IDEs with version history exist? Like similar to Google Docs?

I like how Google Docs has revision history so you can prove that your writing work wasn't done by AI. Is there a similar way of doing this for coding? A certain IDE that has this?

8 Comments
2024/10/30
18:39 UTC

1

Seeking Coding & AI help for a Revit Plugin to Streamline Warehouse Design

Hey everyone! I work at an architecture firm where we primarily design Tilt-Wall or PEMB (Pre-Engineered Metal Building) warehouses, typically with a spec office included. The planning process for these warehouses is highly repetitive—consistent grid spacing, standard tilt-wall panel lengths, etc. I think there’s an opportunity here to streamline this with a specialized Revit plugin, and I’m looking for someone skilled in coding and AI to help make it happen.

Here’s the vision:

  1. Warehouse Layout Automation: The plugin would allow you to select a warehouse type (cross-dock, front-loaded, or rear-loaded) and then specify details like the number of overhead doors and panel height. Based on these inputs, it would automatically generate the basic warehouse layout, saving time on the front end.

  2. Office Layout Suggestions: After defining the office area, you’d input the required number of offices, conference rooms, etc. The plugin would generate several office layout options based on common requirements (open office, private offices, restrooms, storage, break rooms).

While I know AI isn’t perfect, this plugin could help companies save significant time on layout design, letting us focus on detailed construction documents and facade design. Here’s the closest example I’ve found of what I envision: https://m.youtube.com/watch?v=sc6vpcsSA94

I think this could be a valuable tool for architecture firms and developers who want to streamline master planning and design costs. If anyone has experience in AI and coding and is interested in collaborating, please let me know!

3 Comments
2024/10/30
18:00 UTC

2

Can Someone help me with this NLTK error pls!!!

Hi, so I'm getting this error.

Exception has occurred: LookupError

**********************************************************************
  Resource punkt_tab not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt_tab')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt_tab/english/

  Searched in:
    - '/home/ec2-user/nltk_data'
    - '/usr/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - '/home/ec2-user/my_nltk_env/nltk_data'
    - '/home/ec2-user/nltk_data'
**********************************************************************



  File "/home/ec2-user/Masters Project/Email Proccessing.py", line 54, in tokenize_text
    return word_tokenize(text)
  File "/home/ec2-user/Masters Project/Email Proccessing.py", line 57, in <module>
    email_df['tokenized_text'] = email_df['cleaned_text'].apply(tokenize_text)
LookupError: 
**********************************************************************
  Resource punkt_tab not found.
  Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('punkt_tab')
  
  For more information see: https://www.nltk.org/data.html

  Attempted to load tokenizers/punkt_tab/english/

  Searched in:
    - '/home/ec2-user/nltk_data'
    - '/usr/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
    - '/home/ec2-user/my_nltk_env/nltk_data'
    - '/home/ec2-user/nltk_data'
**********************************************************************

for this code given below.

import pandas as pd
import re
import nltk
nltk.data.path.append('/home/ec2-user/my_nltk_env/nltk_data')
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

# Download necessary NLTK data
nltk.data.path.append('/home/ec2-user/nltk_data')  # Add the NLTK data path
nltk.download('punkt', download_dir='/home/ec2-user/nltk_data')
nltk.download('stopwords', download_dir='/home/ec2-user/nltk_data')
nltk.download('wordnet', download_dir='/home/ec2-user/nltk_data')

# Load the email dataset
email_file_path = "/home/ec2-user/Masters Project/test_set.csv"
email_df = pd.read_csv(email_file_path)

# Display the first few rows of the email dataset to verify loading
print("Sample data from email dataset:\n", email_df.head())

# Function to clean text
def clean_text(text):
    if isinstance(text, str):  # Check if the input is a string
        # Convert to lowercase
        text = text.lower()
        # Remove HTML tags
        text = re.sub(r'<.*?>', '', text)
        # Remove special characters and digits
        text = re.sub(r'[^a-zA-Z\s]', '', text)
        # Remove extra spaces
        text = re.sub(r'\s+', ' ', text).strip()
        return text
    else:
        return ""  # Return an empty string for non-text entries

# Concatenate relevant text fields (sender, receiver, subject, body) into a single text field
email_df['combined_text'] = (
    email_df['sender'].astype(str) + " " + 
    email_df['receiver'].astype(str) + " " + 
    email_df['subject'].astype(str) + " " + 
    email_df['body'].astype(str)
)

# Apply text cleaning for combined text
email_df['cleaned_text'] = email_df['combined_text'].apply(clean_text)

# Function to tokenize text
def tokenize_text(text):
    return word_tokenize(text)

# Apply tokenization
email_df['tokenized_text'] = email_df['cleaned_text'].apply(tokenize_text)

# Initialize stemmer and lemmatizer
stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

# Function to stem and lemmatize text
def stem_and_lemmatize(tokens):
    stemmed = [stemmer.stem(token) for token in tokens]
    lemmatized = [lemmatizer.lemmatize(token) for token in stemmed]
    return lemmatized

# Apply stemming and lemmatization
email_df['processed_text'] = email_df['tokenized_text'].apply(stem_and_lemmatize)

# Function to join tokens back into a single string for vectorization
def join_tokens(tokens):
    return ' '.join(tokens)

# Join tokens into a single string
email_df['final_text'] = email_df['processed_text'].apply(join_tokens)

# Vectorization using TF-IDF
vectorizer = TfidfVectorizer(max_features=5000)  # Adjust max_features based on your dataset size

# Apply TF-IDF vectorization to the processed text data
X_text = vectorizer.fit_transform(email_df['final_text'])

# Convert the TF-IDF matrix to a Data Frame 
tfidf_df = pd.DataFrame(X_text.toarray(), columns=vectorizer.get_feature_names_out())

# Display the first few rows of the TF-IDF data
print("TF-IDF data from email text:\n", tfidf_df.head())

# URL Preprocessing
url_file_path = "/home/ec2-user/Masters Project/fake_test_set.csv"
url_df = pd.read_csv(url_file_path)

# Check the first few rows to verify the URL dataset is loaded correctly
print("Sample data from URL dataset:\n", url_df.head())

# Check for an IP address in the URL
def has_ip(url):
    return 1 if re.search(r'[0-9]+(?:\.[0-9]+){3}', url) else 0

# Check for subdomains in the URL
def count_subdomains(url):
    return url.count('.') - 1

# Check for suspicious keywords in the URL
def has_suspicious_words(url):
    suspicious_words = ['login', 'verify', 'account', 'update', 'secure']
    return 1 if any(word in url.lower() for word in suspicious_words) else 0

# Extracting length of domain
def domain_length(url):
    domain = re.findall(r'://(www\.)?([A-Za-z_0-9.-]+).*', url)
    if domain:
        return len(domain[0][1])
    return 0

# Check if URL uses HTTPS or not
def is_https(url):
    return 1 if url.startswith('https') else 0

# Feature extraction for URL column
url_df['has_ip'] = url_df['url'].apply(has_ip)
url_df['num_subdomains'] = url_df['url'].apply(count_subdomains)
url_df['suspicious_words'] = url_df['url'].apply(has_suspicious_words)
url_df['domain_length'] = url_df['url'].apply(domain_length)
url_df['is_https'] = url_df['url'].apply(is_https)

# Display a few rows of the extracted URL features
print("Extracted URL features:\n", url_df[['url', 'has_ip', 'num_subdomains', 'suspicious_words', 'domain_length', 'is_https']].head())

# Select relevant URL features
url_features = url_df[['has_ip', 'num_subdomains', 'suspicious_words', 'domain_length', 'is_https']]

# Concatenate URL and email text features
combined_df = pd.concat([url_features.reset_index(drop=True), tfidf_df.reset_index(drop=True)], axis=1)

# Verify the combined dataset
print("Combined dataset with URL and Text features:\n", combined_df.head())

# Assuming both datasets contain the 'label' column (indicating phishing or legitimate)
y = url_df['label']  # The target labels (phishing or not)

# Split the combined dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(combined_df, y, test_size=0.2, random_state=42)

print("Training set size:", X_train.shape)
print("Testing set size:", X_test.shape)

# Train using Random Forest
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# Predictions of test data
y_pred = rf_model.predict(X_test)

# Evaluation of the model
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=1)
recall = recall_score(y_test, y_pred, zero_division=1)
f1 = f1_score(y_test, y_pred, zero_division=1)
conf_matrix = confusion_matrix(y_test, y_pred)

# Evaluation Results
print("\nModel Evaluation Metrics:")
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
print("Confusion Matrix:")
print(conf_matrix)


FYI, I am running it on AWS EC2. Please help me with this error.
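The error message itself names the missing resource: the script downloads `punkt`, but the tokenizer is looking for `punkt_tab`. A small sketch of a helper that downloads a resource only when NLTK cannot find it is shown below. The helper name and the `nltk_module` parameter (there so the function can be exercised without NLTK installed) are assumptions made for this example; `nltk.data.find` and `nltk.download` are the real NLTK calls it relies on.

```python
def ensure_nltk_resource(resource_path, package_name, download_dir=None,
                         nltk_module=None):
    """Download `package_name` only if `resource_path` cannot be found.

    Returns True if a download was triggered, False if the resource was
    already present. `nltk_module` exists only to allow testing with a
    stand-in object; normally it is left as None.
    """
    if nltk_module is None:
        import nltk as nltk_module  # imported lazily on first real use

    try:
        nltk_module.data.find(resource_path)
        return False
    except LookupError:
        nltk_module.download(package_name, download_dir=download_dir)
        return True
```

In the script above this would be called before tokenizing, for example `ensure_nltk_resource("tokenizers/punkt_tab", "punkt_tab", "/home/ec2-user/nltk_data")`, using the same download directory the script already adds to `nltk.data.path`.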
1 Comment
2024/10/30
17:34 UTC

1

Visual Studio

In Visual Studio Code, I am learning the C language. When I use scanf, I press enter after entering a value in the terminal, but instead of taking the value, it moves to the next line. How can I fix this to ensure that pressing enter submits the value?

4 Comments
2024/10/30
17:08 UTC

0

Game menu explanation

I need help understanding how a video game menu works in the programming sense. I want a better understanding of the coding that goes into creating a functioning menu.
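In many games a menu boils down to three things: a list of options, an index saying which option is highlighted, and a function that reacts to input by moving the index or running the selected option's action. The sketch below shows that idea in plain Python with no game engine. The class name, the key names ("up", "down", "select") and the text rendering are assumptions made for illustration; a real game would draw the options on screen and read keys from its input system.

```python
class Menu:
    """A minimal menu: a list of (label, action) pairs plus a selected index."""

    def __init__(self, options):
        if not options:
            raise ValueError("a menu needs at least one option")
        self.options = list(options)
        self.selected = 0

    def handle_input(self, key):
        """Move the highlight or run the selected action; return its result."""
        if key == "up":
            self.selected = (self.selected - 1) % len(self.options)
        elif key == "down":
            self.selected = (self.selected + 1) % len(self.options)
        elif key == "select":
            _label, action = self.options[self.selected]
            return action()
        return None

    def render(self):
        """Return the menu as text, marking the highlighted option with '>'."""
        lines = []
        for index, (label, _action) in enumerate(self.options):
            marker = ">" if index == self.selected else " "
            lines.append(f"{marker} {label}")
        return "\n".join(lines)


if __name__ == "__main__":
    menu = Menu([
        ("Start game", lambda: "starting"),
        ("Options", lambda: "opening options"),
        ("Quit", lambda: "quitting"),
    ])
    menu.handle_input("down")
    print(menu.render())
    print(menu.handle_input("select"))
```

The game loop would call `handle_input` whenever a key is pressed and `render` every frame; nested menus are often handled by having an action swap in a different `Menu` object.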

5 Comments
2024/10/30
13:33 UTC

1

hello!

hello! I'm 16, and very much interested in coding. I want to be a game developer

I'm thinking I should learn C++,

could anyone give me suggestions and advice. on where i should start?

thank you so much!

19 Comments
2024/10/30
05:48 UTC

1

How to make a plugin/addon for Discord (like vencord and betterDiscord) Setup guide or code?

We are NOT trying to make a plugin for vencord or betterDiscord. We are trying to make an actual add on/plugin for Discord itself. Meaning users will go on the web and install our app, once installed the Discord app will have a different look to what it usually looks like.

We just want to know how we can achieve this, how do we use electron and js injection to alter the look of discord itself.

This isn't exactly a 1:1 problem, but even a bit of guidance would help.

My developer and I are having a tricky time figuring out how we can make an add-on for Discord. We're solely making a simple Discord plugin that could benefit user experience. It is not a client or a Discord modification app similar to Vencord and BetterDiscord. It is just going to add a few little buttons to the actual Discord app (on macOS and Windows).

We know it's something to do with Electron and JS injection, but we really need a bit of a lift-off as we don't know where to start when it comes to coding and actually testing our plugin on the Discord application.

How do we actually run and test our app, and make it work?

We've tried a few things: we've looked through the source code of BetterDiscord and DMed developers, but no one can help us.

We really just want to make a simple plugin to benefit user experience on discord.

0 Comments
2024/10/30
03:50 UTC

1

How hard is it to go from python to c++?

16-year-old here. I’m currently learning Python but I kinda want to switch to C++ because I would like to try and make games in Unity.

How difficult is it to switch to C++ to use in Unity? What do you guys think?

4 Comments
2024/10/30
03:40 UTC

1

ursina engine help

I'm trying to add motion blur when any object or the camera moves. Is this possible in ursina, and if so how would I implement it?

0 Comments
2024/10/29
23:51 UTC

0

Might not be what I'm supposed to post but I'm worried

\downshotvr\binaries\win64\downshotvr-win64-shipping.exe is the thing I'm worried about. Do any of you know if Windows Defender is being paranoid about this or?

0 Comments
2024/10/29
22:39 UTC

1

Overfitting or data preprocessing problem?

For a while I have been building an AI model to classify digits from brain scans. I have managed to train the model to '99% accuracy', but when I run it on a test dataset it fails. Could somebody with more knowledge and experience take a look at my code to see whether it is overfitting or a problem with my makeshift data processing? Any help would be very kind.

https://www.kaggle.com/code/scpcontainment/notebook69e313a46c

2 Comments
2024/10/29
18:57 UTC

0

The coding mindset.

I don't know if this falls under stupid questions, but I'm having trouble with the mindset that comes with coding and problem-solving.

I understand the coding aspect; I can read it and write it; however, when it comes to solving problems and beginning from scratch, I can't wrap my head around it.

There's like a block in my mind. I can't take my knowledge and create code to fix/solve a problem.

I'm wondering if there are any resources out there to assist in this matter. I've been practicing for some time now and would appreciate any help.

Thank you.

9 Comments
2024/10/29
14:22 UTC

1

Help

I am 13 years old. My older brother started coding around 14, is now 19, is very experienced, and already has a well-paying job at a company. I have always been interested in coding. A year ago I watched "Python for Beginners" by Coding with Mosh. It was a great video, but I watched half of it and then just lost interest because I wasn't seeing real progress and I just didn't have the motivation. Recently my brother and I were talking, and he told me that now is a great time to start because of how ChatGPT could be used to help me. Now I don't know which language to learn first. Thank you.

5 Comments
2024/10/29
12:07 UTC
