GitHub scraper app for scraping data from a specific user profile, built with the Streamlit and BeautifulSoup Python packages

Siva Prakash

Last update: Apr 5, 2022

Related tags

Web Crawling github-scraper-app

Overview

Github Scraper

The GitHub scraper app scrapes data for a specific user profile.
The app takes a GitHub profile name and checks whether that user name exists.
If the user name exists, the app scrapes the data from that GitHub profile.
If the user name doesn't exist, the app displays an info message.
You can download the scraped data as CSV, JSON, or a pandas profiling HTML report.
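
The existence check described above could be sketched roughly as follows. This is a minimal sketch, not the app's actual code: `profile_exists` and `fetch_status` are hypothetical names, and it assumes GitHub answers a request for an unknown profile URL with HTTP 404.

```python
def fetch_status(url):
    """Return the HTTP status code of a GET request to `url`."""
    # Third-party package; imported here so profile_exists can be tested offline.
    import requests

    return requests.get(url, timeout=10).status_code


def profile_exists(user_name, status_for=fetch_status):
    """Return True when the profile page for `user_name` answers with HTTP 200.

    Assumption: GitHub responds with 404 for a user name that does not exist.
    `status_for` is injectable so the check can run without network access.
    """
    url = "https://github.com/{}".format(user_name)
    return status_for(url) == 200
```

In the app, a check like this would presumably run before the scraping step, with something like Streamlit's `st.info` showing the message when it returns `False`.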

Installation :-

To install all the packages the app requires 👇

pip install -r requirements.txt

Packages Used :-

import requests
import pandas as pd
import streamlit as st
from bs4 import BeautifulSoup
from pandas_profiling import ProfileReport
from streamlit_pandas_profiling import st_profile_report

Function To Scrape the Data :-

def ScrapeData(user_name):
    # Fetch the user's repositories tab and parse the returned HTML.
    url = "https://github.com/{}?tab=repositories".format(user_name)
    page = requests.get(url)
    soup = BeautifulSoup(page.content, "html.parser")

    # Profile details; these lookups assume the profile page exists.
    info = {"name": soup.find(class_="vcard-fullname").get_text()}
    info["image_url"] = soup.find(class_="avatar-user")["src"]
    # Keep only the first line of each link's text.
    info["followers"] = (
        soup.select_one("a[href*=followers]").get_text().strip().split("\n")[0]
    )
    info["following"] = (
        soup.select_one("a[href*=following]").get_text().strip().split("\n")[0]
    )

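    # Location and website are optional: select_one returns None when the
    # element is missing, so .get_text() raises AttributeError.
    try:
        info["location"] = soup.select_one("li[itemprop*=home]").get_text().strip()
    except AttributeError:
        info["location"] = ""

    try:
        info["url"] = soup.select_one("li[itemprop*=url]").get_text().strip()
    except AttributeError:
        info["url"] = ""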

    repositories = soup.find_all(class_="source")
    repo_info = []
    for repo in repositories:
        # Each field falls back to an empty string when its element is missing.
        try:
            name = repo.select_one("a[itemprop*=codeRepository]").get_text().strip()
            link = "https://github.com/{}/{}".format(user_name, name)
        except AttributeError:
            name = ""
            link = ""

        try:
            updated = repo.find("relative-time").get_text()
        except AttributeError:
            updated = ""

        try:
            language = repo.select_one("span[itemprop*=programmingLanguage]").get_text()
        except AttributeError:
            language = ""

        try:
            description = repo.select_one("p[itemprop*=description]").get_text().strip()
        except AttributeError:
            description = ""

        repo_info.append(
            {
                "name": name,
                "link": link,
                "updated": updated,
                "language": language,
                "description": description,
            }
        )
    repo_info = pd.DataFrame(repo_info)
    return info, repo_info
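
The overview mentions CSV and JSON downloads. One way to turn the DataFrame returned by `ScrapeData` into downloadable text is sketched below. `to_download_payloads` is a hypothetical helper, not a function from the app, and the sample row is invented purely for illustration.

```python
import pandas as pd


def to_download_payloads(repo_info):
    """Serialize a repositories DataFrame to a CSV string and a JSON string."""
    csv_text = repo_info.to_csv(index=False)  # omit the row index column
    json_text = repo_info.to_json(orient="records")  # one JSON object per row
    return csv_text, json_text


# Illustrative data only; real rows would come from ScrapeData(user_name)[1].
sample = pd.DataFrame(
    [{"name": "demo", "link": "https://github.com/someone/demo", "language": "Python"}]
)
csv_text, json_text = to_download_payloads(sample)
```

In a Streamlit app, strings like these could be handed to `st.download_button` through its `data` argument; whether the original app does it this way is not shown in the README.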

Demo GIF Image 👇 :-
