The mysterious jpg problem

The mysterious jpg problem in Reddit Image Grabber.

As mentioned in a previous post, there always seem to be some grabbed images that won’t display. Windows says they are corrupt, which is BS of course.

The original source snippet that I built RIG around saves every file it downloads with a .jpg extension, no matter what it actually is, doh!

So I got my hex editor out, and it appears that a lot of the non-displaying .jpgs are in fact .html files that play short GIF-like videos (HTML5, I guess?).

So I’m going to have to ask the user to rename the bad ones to .html if they want to view them.

Region
The last .jpg will not be viewable; it’s an HTML5 file.

Hmm, could RIG detect and rename the files for the user? Let’s not go down that road just yet.
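For what it's worth, here is a rough sketch of how that detection might work, by peeking at the first few bytes of each saved file (the same thing the hex editor shows). The function names `sniff_extension` and `fix_extension` are made up for illustration and are not part of RIG, and the signature list is only a partial one.

```python
import os

# Leading "magic" bytes for a few common image formats (partial list).
SIGNATURES = [
    (b"\xff\xd8\xff", ".jpg"),
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"BM", ".bmp"),
]

def sniff_extension(first_bytes):
    """Guess a file extension from the first bytes of a file.

    Returns None when nothing matches.
    """
    for magic, ext in SIGNATURES:
        if first_bytes.startswith(magic):
            return ext
    # HTML pages usually start with a doctype or an <html> tag.
    head = first_bytes.lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return ".html"
    return None

def fix_extension(path):
    """Rename the file at path if its content disagrees with its extension."""
    with open(path, "rb") as handle:
        ext = sniff_extension(handle.read(32))
    root, current = os.path.splitext(path)
    if ext is None or ext == current.lower():
        return path
    new_path = root + ext
    os.replace(path, new_path)
    return new_path
```

Calling `fix_extension` on each file after a grab would leave real JPEGs alone and rename the impostors. The two-byte BMP check is weak, so a text file that happens to start with "BM" would be misjudged.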

See updated RIG Help page for further details on this and more.

A thought occurred to me a few seconds ago, and I like the idea.

Quick Test Mode for RIG?

I’m thinking that a lot of potential users of RIG will be put off by having to sign up and get Reddit API passwords. So how about a quick test mode, where the user can download a maximum of 10 images using an account that I set up for that purpose? The passwords would be hidden from the user, and test mode would be limited to one use per day (somehow). It could be a lot of work to implement, but it would improve the usability of the program a lot.

Let me think about it for a while.

In the meantime, here is the full source of RIG V0.40 for your pleasure and amusement.

Note that on start up, RIG now creates a folder on C:\ called “Reddit-Images” as a default download location.

RIG also creates four tiny .ini files for saving and loading settings: dest.ini, cid.ini, cls.ini and rig.ini.
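As an aside, the four files could probably be folded into one using the standard library's `configparser`. This is only a sketch of that idea: the section names, key names and the `save_settings` / `load_settings` helpers are assumptions for illustration, not what RIG V0.40 does.

```python
import configparser

def save_settings(path, client_id, client_secret, dest, options):
    """Write every setting into a single ini file (hypothetical layout)."""
    # interpolation=None so a "%" in a value is stored literally
    config = configparser.ConfigParser(interpolation=None)
    config["api"] = {"client_id": client_id, "client_secret": client_secret}
    config["download"] = {"dest": dest}
    config["options"] = options  # dict of str -> str
    with open(path, "w") as handle:
        config.write(handle)

def load_settings(path):
    """Return {section: {key: value}}; empty dict if the file is missing."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # a missing file is silently skipped
    return {section: dict(config[section]) for section in config.sections()}
```

All values have to be strings, so the numbers from the drop-down boxes would need `str()` on the way in and `int()` on the way out.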

I am actually working on V0.43 at the moment, trapping user errors and making some other small enhancements, but it needs a lot of testing before I publish it here, in a day or so maybe.

'''RIG V0.40 Download x amount of images from a subreddit'''
from tkinter import Tk, Button, E, END, Entry, filedialog, \
LabelFrame, Menu, messagebox, RIDGE, W
from tkinter.ttk import Combobox

import os
import urllib.request as web
import shutil
import subprocess
from time import sleep
import webbrowser
import pyautogui
import praw

ROOT = Tk()

#create default download folder
MY_DIR = ("C:\\Reddit-Images\\")
#check if MY_DIR exists
CHECK_FOLDER = os.path.isdir(MY_DIR)
#if it doesn't create it
if not CHECK_FOLDER:
    os.makedirs(MY_DIR)

#pylint rated +8.93 (1 global FOLDER_SELECTED )
# (event) unused argument x3 lines 96,101,106
#lots of var problems to fix, later

#___________________functions_____________________________

def get_images():
    '''connect to reddits api and get images'''

    #get users params from GUI entry boxes
    red_clid = E1.get()
    red_secret = E2.get()
    selected_subred = COMBO1.get()
    images_tograb = COMBO2.get()
    pause_length = COMBO4.get()
    get_cat = COMBO3.get()

    #call reddit api with login details and what we want to dl
    REDDIT = praw.Reddit(client_id=str(red_clid),
                         client_secret=str(red_secret),
                         user_agent='RIG V0.40')

    #basically because we cant change the praw category .method
    #with a variable,had to resort to pre-creating each outcome
    if get_cat == "hot":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).hot \
        (limit=int(images_tograb))
    elif get_cat == "top":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).top \
        ('all', limit=int(images_tograb))
    elif get_cat == "rising":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).rising \
        (limit=int(images_tograb))
    elif get_cat == "gilded":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).gilded \
        (limit=int(images_tograb))
    else:
        sub_reddit = REDDIT.subreddit(str(selected_subred)).new \
        (limit=int(images_tograb))

    #get users save folder
    d_p = SDOF.get()
    #normpath also turns forward slashes into backslashes on Windows
    DIR_PATH = os.path.normpath(str(d_p))

    #open in explorer to view dls live as they come in
    dest_fold = SDOF.get()
    subprocess.Popen(["C:/Windows/explorer.exe", dest_fold.replace('/', '\\')])

    #the actual downloading bit
    for submissions in sub_reddit:
        #ignore stickies
        if not submissions.stickied:
            fullfilename = os.path.join(DIR_PATH, "{}.jpg".format(submissions))
            request = web.Request(submissions.url)
            #note the linesplit for easy reading
            with web.urlopen(request) as response, \
            open(fullfilename, 'wb') as out_file:
                shutil.copyfileobj(response, out_file)
                dir_count = len(os.listdir(DIR_PATH))
                #note the line split
                print("Downloaded. {} file(s) saved in '{}'." \
                .format(dir_count, DIR_PATH))
                #optional,but prob good idea if u dont want to get banned
                sleep(int(pause_length))

    messagebox.showinfo("Reddit Image Grabber", "Downloads completed")

def about_menu():
    '''display about message box'''
    messagebox.showinfo("About", "RIG V0.40. Freeware. By Steve " \
    "Shambles 2018\nRIG allows you to download images in bulk " \
    "from Reddit.com")

def visit_blog():
    '''visit my blog'''
    webbrowser.open("https://stevepython.wordpress.com/")

def onleftclick_cid(event):
    '''on left mouse click clear client id box and paste clipboard in it '''
    E1.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onleftclick_secret(event):
    '''on left click clear client secret box and paste clipboard in it '''
    E2.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onleftclick_getdest(event):
    '''get user selected place to save images'''
    DIR_PATH = filedialog.askdirectory()
    SDOF.delete(0, END)
    SDOF.insert(0, DIR_PATH)
    return DIR_PATH

def onrightclick_opendest(event):
    '''Display contents of folder in explorer'''
    dest_fold = SDOF.get()
    subprocess.Popen(["C:/Windows/explorer.exe", dest_fold.replace('/', '\\')])

def save_ids():
    '''save all the current settings to ini files'''
    red_clid = E1.get()
    red_secret = E2.get()
    dest_fold = SDOF.get()

    #save the right side options
    with open("cid.ini", 'w') as contents:
        contents.write(red_clid)
        print("cid written")

    with open("cls.ini", 'w') as contents:
        contents.write(red_secret)
        print("cls written")

    with open("dest.ini", 'w') as contents:
        contents.write(dest_fold)
        print("dest written")

    #now do the left side options
    c_1 = COMBO1.get()
    c_2 = COMBO2.get()
    c_3 = COMBO3.get()
    c_4 = COMBO4.get()
    allcs = str(c_1)+"\n"+str(c_2)+"\n"+str(c_3)+"\n"+str(c_4)
    with open("rig.ini", 'w') as contents:
        contents.write(allcs)

def default_settings():
    '''reset all user settings to default'''
    with open("rig.ini", 'w') as contents:
        contents.write("Subreddit\nImages to grab\nCategory\nPauses")
    with open("rig.ini", 'r') as f:
        lines = f.read().splitlines()
    COMBO1.delete(0, END)
    COMBO1.insert(0, lines[0])
    COMBO2.delete(0, END)
    COMBO2.insert(0, lines[1])
    COMBO3.delete(0, END)
    COMBO3.insert(0, lines[2])
    COMBO4.delete(0, END)
    COMBO4.insert(0, lines[3])

    #reset right frame settings
    with open("cid.ini", 'w') as contents:
        contents.write("Enter Client ID")
    E1.delete(0, END)
    E1.insert(0, "Enter Client ID")

    with open("cls.ini", 'w') as contents:
        contents.write("Enter Client Secret")
    E2.delete(0, END)
    E2.insert(0, "Enter Client Secret")

    with open("dest.ini", 'w') as contents:
        contents.write("C:/Reddit-Images")
    SDOF.delete(0, END)
    SDOF.insert(0, "C:/Reddit-Images")

def rig_help():
    '''visit RIG help page on my blog'''
    webbrowser.open("https://stevepython.wordpress.com/2018/08/13/" \
    "reddit-image-grabber-help")

#_________________________Main_____________________________________

#create window
ROOT.title("RIG V0.40 Steve Shambles 2018")
ROOT.geometry("369x180")
ROOT.resizable(False, False)

#drop down menu
MENU_BAR = Menu(ROOT)
FILE_MENU = Menu(MENU_BAR, tearoff=0)
MENU_BAR.add_cascade(label="File", menu=FILE_MENU)
FILE_MENU.add_command(label="How to use RIG", command=rig_help)
FILE_MENU.add_command(label="Save settings", command=save_ids)
FILE_MENU.add_command(label="Reset settings", command=default_settings)
FILE_MENU.add_command(label="Visit Blog", command=visit_blog)
FILE_MENU.add_separator()
FILE_MENU.add_command(label="About", command=about_menu)
FILE_MENU.add_command(label="Exit", command=ROOT.destroy)
ROOT.config(menu=MENU_BAR)

#create frames

#labelframe for selection of subreddit, limit, pause, etc (left side)
FRAME1 = LabelFrame(ROOT, relief=RIDGE)
FRAME1.grid(padx=5, pady=8)

#labelframe for entering api details (right side)
FRAME0 = LabelFrame(ROOT, relief=RIDGE)
FRAME0.grid(row=0, column=1, padx=5, pady=8)

#api client id entry box
E1 = Entry(FRAME0, bd=3)
E1.grid(sticky=W+E, padx=5, pady=6)
E1.delete(0, END)
E1.insert(0, "Enter Client ID")
E1.focus() #set the cursor in this box
E1.bind("<Button-1>", onleftclick_cid)

#api client secret entry box
E2 = Entry(FRAME0, bd=3)
E2.grid(sticky=W+E, padx=5, pady=6)
E2.delete(0, END)
E2.insert(0, "Enter Client secret")
E2.focus()
E2.bind("<Button-1>", onleftclick_secret)

#select drive or folder entry box
SDOF = Entry(FRAME0, bd=3)
SDOF.grid(sticky=W+E, padx=5, pady=5)
SDOF.delete(0, END)
SDOF.insert(0, "Save folder")
SDOF.bind("<Button-1>", onleftclick_getdest)
SDOF.bind("<Button-3>", onrightclick_opendest)

#select subreddit
COMBO1 = Combobox(FRAME1)
COMBO1['values'] = ("Subreddit", "Art", "Cats", "celebrity", "Cringepics", \
"DogPictures", "EarthPorn", "ExpectationVsReality", "Funny", "FunnyAndSad", \
"HistoryPorn", "Images", "ImageComics", "Instagram", "Itookapicture", \
"Lol", "Memes", "Naturepics", "NoContextPics", "Pareidolia", \
"PerfectTiming", "Photographs", "Photoshopbattles", "Pics",  \
"PrettyGirls", "TheWayWeWere", "Tumblr", "Wallpapers", "Woahdude")
COMBO1.current(0) #set the selected item
COMBO1.grid(sticky=W+E, padx=5, pady=5)

#choose amount of images to download
COMBO2 = Combobox(FRAME1)
COMBO2['values'] = ("Images to grab", 1, 5, 10, 25, 50, 100, 250, 500, 999)
COMBO2.current(0)
COMBO2.grid(sticky=W+E, padx=5, pady=5)

#select topic combobox
COMBO3 = Combobox(FRAME1)
COMBO3['values'] = ("Category", "new", "hot", "top", "rising")
COMBO3.current(0)
COMBO3.grid(sticky=W+E, padx=5, pady=5)

#select download pause combobox
COMBO4 = Combobox(FRAME1)
COMBO4['values'] = ("Pauses in seconds", 1, 2, 3, 4, 5)
COMBO4.current(0)
COMBO4.grid(sticky=W+E, padx=5, pady=5)

#grab images button
B = Button(FRAME0, text="Grab Images", command=get_images)
B.grid(sticky=W+E, padx=5, pady=5)

#read ini files
#if cid.ini exists insert the code into GUI for user
if os.path.exists("cid.ini"):
    with open("cid.ini", 'r') as contents:
        GET_CID = contents.read()
        E1.delete(0, END)
        E1.insert(0, str(GET_CID))

#if not then create it
if not os.path.exists("cid.ini"):
    #default text otherwise
    with open("cid.ini", 'w') as contents:
        contents.write("Enter Client ID")
        E1.delete(0, END)
        E1.insert(0, str("Enter Client ID"))

#if cls.ini exists insert the code into box for user
if os.path.exists("cls.ini"):
    with open("cls.ini", 'r') as contents:
        GET_CLS = contents.read()
        E2.delete(0, END)
        E2.insert(0, str(GET_CLS))

#if not then create it
if not os.path.exists("cls.ini"):
    with open("cls.ini", 'w') as contents:
        contents.write("Enter Client Secret")
        E2.delete(0, END)
        E2.insert(0, str("Enter Client Secret"))

#if dest.ini exists insert the code into box for user
if os.path.exists("dest.ini"):
    with open("dest.ini", 'r') as contents:
        GET_DEST = contents.read()
        SDOF.delete(0, END)
        SDOF.insert(0, str(GET_DEST))
        DIR_PATH = (SDOF.get())

#if not then create it
if not os.path.exists("dest.ini"):
    with open("dest.ini", 'w') as contents:
        contents.write("C:/Reddit-Images")
        SDOF.delete(0, END)
        SDOF.insert(0, str("C:/Reddit-Images"))

#if rig.ini exists insert the codes into GUI for user
if os.path.exists("rig.ini"):
    with open("rig.ini", 'r') as f:
        lines = f.read().splitlines()
        COMBO1.delete(0, END)
        COMBO1.insert(0, lines[0])
        COMBO2.delete(0, END)
        COMBO2.insert(0, lines[1])
        COMBO3.delete(0, END)
        COMBO3.insert(0, lines[2])
        COMBO4.delete(0, END)
        COMBO4.insert(0, lines[3])

#if rig.ini not found call default settings function
if not os.path.exists("rig.ini"):
    default_settings()

ROOT.mainloop()
#to do
#error checks, essential
# help file in html on blog (work in progress)
# all in one ini

Using Python V3.5.6 on Windows 7.

Reddit Image Grabber Help

This is the help file for RIG (Reddit Image Grabber)

Before you can use RIG you need to get authorization from the Reddit API. It’s quite simple, only takes a moment, and you only need to do it once.

  1. Sign up for a Reddit account, unless you already have one.
  2. Navigate to “preferences” then click the “create App” tab.
  3. Click on “are you a developer? Create app”.

righelp2

You will now see this form appear at the bottom of the page. Complete the form, similar to the image below. It’s most important that you select the “script” option.

Reddit create app

Click on the “Create App” button.

The following should appear. The “personal use script” code and the “secret” code are what you need to paste into RIG. That’s it, you are good to go.

created-app-blur

Copy the “personal use script” code by highlighting it with the mouse, then right click and select “copy” from the drop-down menu.

Now run RIG, and paste the personal use script code into the first entry box on the right. Repeat this for the “secret” code, pasting it into the second entry box on the right. (You only have to click inside the box once and it will clear the box and paste your code in automatically.)

The codes in the example below are fake BTW.

rig-help-paste codes

Finally, click on the File menu and click on “Save settings” and you won’t have to re-enter the codes every time you run RIG.  Any settings changes you may have made to the drop-down boxes on the left  and the destination folder will also be saved.

Your first image grab

Click on the first drop-down box on the left, “Subreddit”, and choose a subreddit from those listed, or type your own preferred subreddit in by hand (careful with the spelling or it won’t work). I chose “funny” for this test.

Now onto the second drop down box, “Images to grab“.  Select your desired amount.  For a simple test, I’d go for 5.

NOTE: If you try for more than 1000 images in a day, Reddit is likely to ban you for the rest of that day.

Next is “Category”. If you want to grab from the latest posts then select “new”; if you want the currently most popular posts select “hot”. Selecting “top” will get you posts that are doing well over time, not just the current hot ones. Finally, select “rising” for posts that are gaining votes quickly and are on the up. I selected “new” for this test.

The last drop-down box is “Pauses”. This literally tells RIG how many seconds to pause between getting each image. If you set it to zero you are effectively DDoS attacking Reddit, so don’t do it; you’ll probably get banned and, in the worst case scenario, your account deleted. I’m going with a 2 second pause here.

Now make sure that your Reddit codes are pasted into the top two entry boxes on the right side of RIG, and that they are correct, as discussed above.

Almost there.  Left click the mouse on the last entry box on the right side of RIG and a file selector will appear for you to select a folder to download the images into.

Alternatively, you can just use the default setting of “C:\Reddit-Images”. The folder has already been created on your drive by RIG; delete it if you don’t like that.

rig-tst-run-in-help

 

When you are ready, go to the File menu (top left) and click “Save settings”. All the settings you have just made will be saved. They will be automatically reloaded into RIG the next time you run it.

Now the bit you have been waiting for!

Click on  “Grab Images“.  The Grab button will usually stay pressed until the downloads are complete, (you may not even notice this if you are grabbing just a few images.)

Once RIG goes into download mode it will open Windows Explorer so that you can see the images arriving as they download, and you can double click on them to view them almost straight away, if you want. A pop-up message will inform you when the downloads have been completed.

Tip: change Explorer’s view mode to “thumbnails” for a better experience.

At any other time you can right click (note: not left click) on the drive entry box and the place you chose to save your grabbed images will open up in Windows explorer again for your perusal.

My limitations

Because of the current limitations of my programming skills I do not know how to detect what format files are before downloading them.  So the assumption is that the file is a .jpg image, which in around 85% of cases it will be.

If a file appears not viewable in Explorer, it is possible it is corrupted, but more likely it is an HTML5 file (.html) or a .bmp file, so please rename these by hand.

If you have no luck with those formats, then you could try .png and .gif.  In 90% of cases your file will be useable with one of those file extensions.

Another file type that pops up here and there (depending on the subreddits you are downloading from) is .css, a style sheet connected with IMGUR.
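One possible way around the guessing game is to ask the server what it is about to send before saving anything. The sketch below assumes the server answers a HEAD request with a sensible Content-Type header, which not every site does. The `EXTENSIONS` table and both function names are invented for this example.

```python
import urllib.request as web

# Content-Type values mapped to file extensions (hand-picked, partial list).
EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/bmp": ".bmp",
    "text/html": ".html",
    "text/css": ".css",
    "video/mp4": ".mp4",
}

def extension_for(content_type, default=".jpg"):
    """Turn a Content-Type header value into a file extension."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    mime = (content_type or "").split(";")[0].strip().lower()
    return EXTENSIONS.get(mime, default)

def remote_extension(url):
    """Ask the server what a URL is without downloading the body."""
    request = web.Request(url, method="HEAD")
    with web.urlopen(request) as response:
        return extension_for(response.headers.get("Content-Type"))
```

The result of `remote_extension(submissions.url)` could then replace the hard-coded ".jpg" when the file name is built. Unknown types still fall back to ".jpg", so nothing gets worse than it is now.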

At the end of the day, you could just delete the non displaying .jpgs, and live the rest of your life not knowing what you missed out on 🙂

Seriously though, I do apologise for this inconvenience.  As soon as my programming skills improve I hope to fix this.

Steve Shambles 2018 (Living up to his name)

 

My head is just not right today

Well it is Sunday.  I have had an easy minimal-programming weekend. I did a few hours of Python and the rest was  spent on my other “addictions”, and now I really wish I hadn’t.

My head is not right at this moment.  I have taken all afternoon just to add half of a simple feature to RIG,  and completely failed to implement an even easier feature, that I had already written in another of my programs!

What a donkey!

Everything I did was wrong.  I was (still am) sloppy and lazy, and frequently wanted to give up and go to bed, dammit.

I am so annoyed with myself, that I swear that until I feel proficient at Python;

I am not taking any full days off again if avoidable

I don’t know how long this state of mind will last. I am in the BiPolar range, and I have been diagnosed with ADD (Attention Deficit Disorder).

I have had it since I was a child. My son, as well as other relatives in my immediate family, also have it.

I do not take medication for the ADD any more, and this can sometimes open myself up to self-medicating, that’s what happened this weekend.

Get on with the programming stuff Steve!

OK, enough of this selfish, introspective gobbledegook. What did I get done on RIG?

I got tired of pasting in the client_ID and client_secret into the GUI every time I ran it to test.

After a few hundred times of doing this I suddenly realised I could just store the passwords in a text file and auto load them into the GUI at start up. Doh, why didn’t I think of that days ago? So obvious, isn’t it?

I saved a text file in the current RIG directory, with the three lines of:

  • Client_ID
  • Client_Secret
  • Dir_Path

Dir_Path is where the images are saved. I just used “c:\\temp\\” as the default.

So the text looked a bit like this:

8vM6fcnft5ed
juEOO4Ic4kgnjfhy
C:\\temp\\

However, I just couldn’t remember how to split the lines.  I’ve used it before in my other projects, but I just couldn’t be arsed to dig it out, or even Google it.  That is just so not me, honest.

So being in that lazy and sloppy mood,  I just  created three separate text files just to get the job done.

  • cid.ini
  • cls.ini
  • dest.ini

It’s a poor show, doing it that way, but it did work. 🙂

#if cid.ini exists insert the code into box for user
if os.path.exists("cid.ini"):
    with open("cid.ini",'r') as contents:
        get_cid=contents.read()
        E1.delete(0, END)
        E1.insert(0, str(get_cid))

I repeated the code for the other two .ini files.
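For the record, the single three-line file mentioned earlier can be split with `str.splitlines()`. This is a minimal sketch: the helper names are hypothetical, and it assumes the file holds exactly the client ID, client secret and save path, one per line.

```python
def read_settings(path):
    """Return (client_id, client_secret, dir_path) from a three-line file."""
    with open(path, "r") as handle:
        lines = handle.read().splitlines()  # no trailing "\n" on each item
    if len(lines) < 3:
        raise ValueError("expected three lines in " + path)
    client_id, client_secret, dir_path = lines[:3]
    return client_id, client_secret, dir_path

def write_settings(path, client_id, client_secret, dir_path):
    """Save the three values back, one per line."""
    with open(path, "w") as handle:
        handle.write("\n".join([client_id, client_secret, dir_path]) + "\n")
```

Calling `write_settings` whenever the user changes a box would keep the file in step with the GUI.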

But what about if/when the user changes the codes or the save path in the GUI?
Obviously the .ini file would need to be updated.

No problemo, sounds easy enough, or so I thought.  I made the relevant changes, added some code, then boom!

I have variable problems all of a sudden. The save dir was writing a blank to the .ini, (there’s a naughty joke in there somewhere I think!)

So in desperation I resorted to the dreaded and hated (by the pros, that is) global variable, but then realised I did not actually know much about where to put it/them, and you can rest assured there will be conventions and rules on this.

Googled it, started reading, started looking for excuses to get away from the screen: cup of tea, a smoke, chat with neighbour, run away from the foxes that live in my garden etc.

fox-small

Finally, I thought of another feature I wanted for RIG, to be able to right click on the select folder box to open up the destination folder in windows explorer to look at the files that had just been downloaded.

I have done this before, both in my CBMan program and S-Py, so no excuses, should only take a few minutes, right? Wrong!

So I decided to come here to write this post. 

After editing this text  several times, I must admit that I’m feeling a lot better.  I reckon I will be back to normal tomorrow.

By the way, I try to post once a day, just after midnight, as well as any additional posts I feel like writing that can come at any time, so it’s worth checking back after midnight most days, that’s GMT, I’m in the UK.

Now that I have humiliated myself to all three readers of this blog, maybe that will motivate me to fix these silly little things with RIG.  I’ll let you know.

Sorry for the rambling guys.  Be lucky.

Steve.

I’m using Python V3.5.6 on Windows 7.

RIG Improvements #2

Today is Saturday so I have allowed myself a little down-time for fun (don’t ask), so I haven’t done much on Python today, probably about four hours, and not all of it working on my latest project RIG (Reddit Image Grabber).

I like to read Python related posts, tutorials, videos, books etc. whenever I can (only free ones of course, I’m always broke).

But I have done a little bit on RIG.

I have added another 10 or so subreddits to what will be the built-in list, with more to come. I hope to do 40 or 50 of them, as long as I can find ones that have mostly .jpgs and many thousands of subscribers.

I have decided to not include NSFW subreddits.  I will leave that choice up to the user. There are thousands of real porny subreddits and I don’t want to offend anyone. Just for the record, I personally have no problem with it.

In case you are interested in these things, here is a complete list of all subreddits, over 1 million of them! It’s a text file (open in WordPad or Word, I wouldn’t bother using Notepad, it will hang.)

As for the category problem that I was stuck on all day yesterday, after a bit of editing and lots of testing it looks like it works properly now, hooray!
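It turns out a method can be picked by name with Python's built-in `getattr`, which would shrink the if/elif chain to a couple of lines. The sketch below uses a `FakeSubreddit` stand-in (purely an assumption so the example runs without Reddit credentials). With PRAW the real object would come from `REDDIT.subreddit(...)`.

```python
ALLOWED = ("new", "hot", "top", "rising", "gilded")

def listing_for(subreddit, category, limit):
    """Call subreddit.<category>(limit=...) with the method chosen by name."""
    if category not in ALLOWED:
        category = "new"  # same fallback as the else branch
    method = getattr(subreddit, category)
    return method(limit=limit)

class FakeSubreddit:
    """Stand-in for a PRAW subreddit so the sketch runs offline."""
    def __getattr__(self, name):
        # Every attribute lookup returns a function that reports its call.
        return lambda limit: "{} x{}".format(name, limit)

print(listing_for(FakeSubreddit(), "hot", 5))
```

The `ALLOWED` check matters because the category comes from a box the user can type into, and `getattr` would happily look up any attribute name it is given.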

If you read yesterday’s update, you will know the day ended with the Reddit API blocking me, LOL. I retried after midnight, and sure enough I was unblocked, so there is definitely a 1000 image limit a day.

Another huge problem is coming up, that I will have to deal with soon.  A lot of the downloaded images are not viewable.

This is either because of download corruption, or because the code doesn’t discern between a text post, image, video or a GIF. This is starting to worry me. I hope it’s fixable, as otherwise it will make RIG a little crappier than it deserves to be. I will have to go and rename the downloaded images that won’t view and try to find out what they actually are (I’ll have to dust off the hex editor).
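A cheap first filter would be to look at the end of each post's URL before downloading it. This is a rough sketch only: `image_extension` and the extension list are made up for illustration, and a URL with no extension can still point at an image, so it will miss some.

```python
import os
from urllib.parse import urlparse

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp")

def image_extension(url):
    """Return the URL's image extension in lower case, or None."""
    path = urlparse(url).path  # ignores any ?query or #fragment
    ext = os.path.splitext(path)[1].lower()
    return ext if ext in IMAGE_EXTENSIONS else None
```

In the download loop, a post whose `submissions.url` gives `None` here could be skipped, and the returned extension used in place of the fixed ".jpg".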

Another gripe I have with the original bit of code I started with is the way that it counts downloads by looking at what’s already in the directory, a bit naff that. I should be able to solve that one though.
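One way to do that is to keep a running count inside the loop instead of listing the folder. In this sketch `save_one` is a hypothetical function that downloads a single URL; it is passed in so the counting logic can be tried without touching the network.

```python
def download_all(urls, save_one):
    """Call save_one(url) for each URL and count the ones that succeed."""
    saved = 0
    for url in urls:
        try:
            save_one(url)
        except OSError as err:  # urllib's URLError is a subclass of OSError
            print("Skipped {}: {}".format(url, err))
            continue
        saved += 1
        print("Downloaded. {} file(s) saved so far.".format(saved))
    return saved
```

The count now reflects this run only, so files already sitting in the folder no longer inflate it.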

Two things in the code are baffling me a little at the moment.  I’ve not researched into them as yet, the first is, what does the ‘all’ do in this bit of code?

sub_reddit = REDDIT.subreddit(str(selected_subred)).top \
        ('all', limit=int(images_tograb))

If you look at the similar lines around it, none of them have the “all” bit in them, weird.
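As far as the PRAW documentation goes, the first argument of `top()` is a time filter, and 'all' means "top posts of all time". The other listings such as `hot()` and `new()` have no time filter, which would explain why they look different. The stand-in function below is not PRAW itself; it only copies the argument order to show where the 'all' ends up.

```python
# Time filter values PRAW documents for top(): an assumption worth
# double-checking against the PRAW version actually installed.
VALID_TIME_FILTERS = ("all", "day", "hour", "month", "week", "year")

def top(time_filter="all", limit=None):
    """Stand-in with the same argument order as PRAW's top()."""
    if time_filter not in VALID_TIME_FILTERS:
        raise ValueError("unknown time_filter: " + repr(time_filter))
    return {"time_filter": time_filter, "limit": limit}

# 'all' passed positionally lands in time_filter...
assert top('all', limit=10) == top(limit=10)
# ...so swapping it for 'week' would ask for this week's top posts.
assert top('week', limit=10)["time_filter"] == "week"
```

So in RIG the 'all' could be left out without changing anything, or turned into another drop-down box later.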

The other little mystery is a function I use when called by a mouse-click event on the GUI.

def onleftclick_secret(event):
    '''on left click, clear client secret box and paste clipboard into it '''
    E2.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

Now, when I run Pylint to check the code it reports that event is an unused argument, but if I take out the word “event” and just leave () as usual, the function doesn’t get called. Weirder city or what?
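The likely explanation is that tkinter always calls a bound function with one argument, the event object, so a function that accepts no arguments fails with a TypeError before its body runs. The sketch below imitates that behaviour without opening a window; `FakeEvent` and `fake_click` are stand-ins invented for this example, not tkinter names.

```python
class FakeEvent:
    """Minimal stand-in for the event object tkinter passes in."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

def fake_click(callback):
    """Mimic what tkinter does on a click: call callback(event)."""
    return callback(FakeEvent(x=10, y=20))

def with_event(event):
    return "clicked at {},{}".format(event.x, event.y)

def ignores_event(_event=None):
    # The leading underscore says "unused on purpose".
    return "pasted"

def no_params():
    return "never reached from a binding"

print(fake_click(with_event))
print(fake_click(ignores_event))
try:
    fake_click(no_params)
except TypeError as err:
    print("TypeError:", err)
```

Renaming the parameter to `_event` should also keep Pylint quiet, since its default settings normally ignore arguments whose names start with an underscore.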

I hope I will have all the answers to these questions by tomorrow, but that is pushing it as it’s Sunday and I want to have just a little bit more fun before the weekend is out.

Following is the latest code in full for RIG.

'''RIG V0.25 Download x amount of images from a subreddit'''
from tkinter import Tk, Button, E, END, Entry, filedialog, \
LabelFrame, Menu, messagebox, RIDGE, W
from tkinter.ttk import Combobox

import os
import urllib.request as web
import shutil
from time import sleep
import webbrowser
import pyautogui
import praw

ROOT = Tk()

#pylint rated +9.55 (2 global vars used,FOLDER_SELECTD and REDDIT)

#functions
def get_images():
    '''connect to reddits api and get images'''
    global REDDIT #sorry!

    #get users params from GUI entry boxes
    red_clid = E1.get()
    red_secret = E2.get()
    selected_subred = COMBO1.get()
    images_tograb = COMBO2.get()
    pause_length = COMBO4.get()
    get_cat = COMBO3.get()

    #test check, remove later
    print(selected_subred)
    print(get_cat)
    print(images_tograb)
    print(pause_length)

    REDDIT = praw.Reddit(client_id=str(red_clid),
                         client_secret=str(red_secret),
                         user_agent='RIG V0.25')

    #basically because we cant change the praw .method with a
    #variable,had to resort to pre-creating each outcome,
    #and choosing the correct one.
    if get_cat == "hot":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).hot \
        (limit=int(images_tograb))
    elif get_cat == "top":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).top \
        ('all', limit=int(images_tograb))
    elif get_cat == "rising":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).rising \
        (limit=int(images_tograb))
    elif get_cat == "gilded":
        sub_reddit = REDDIT.subreddit(str(selected_subred)).gilded \
        (limit=int(images_tograb))
    else:
        sub_reddit = REDDIT.subreddit(str(selected_subred)).new \
        (limit=int(images_tograb))

    #line below worked but without get_cat,
    #the above IF block does that now:
    #sub_reddit = REDDIT.subreddit(str(selected_subred)).new \
    #(limit=int(images_tograb))

    #get users save folder
    dir_path = (SDOF.get())

    for submissions in sub_reddit:
        #ignore stickies
        if not submissions.stickied:
            fullfilename = os.path.join(dir_path, "{}.jpg".format(submissions))
            request = web.Request(submissions.url)
            #note the linesplit
            with web.urlopen(request) as response, \
            open(fullfilename, 'wb') as out_file:
                shutil.copyfileobj(response, out_file)
                dir_count = len(os.listdir(dir_path))
                print(fullfilename)
                #note the line split
                print("Downloaded. {} file(s) saved in '{}'." \
                .format(dir_count, dir_path))

                #optional,but prob good idea if u dont want to get banned
                sleep(int(pause_length))

def about_menu():
    '''display about message box'''
    messagebox.showinfo("About", "RIG V0.25. Freeware. By Steve Shambles 2018" \
    "\nRIG allows you to download images in bulk from Reddit.com")

def visit_blog():
    '''visit my blog'''
    webbrowser.open("https://stevepython.wordpress.com/")

def onleftclick_cid(event):
    '''on left mouse click clear client id box and paste clipboard in it '''
    E1.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onleftclick_secret(event):
    '''on left click clear client secret box and paste clipboard in it '''
    E2.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onleftclick_getdest(event):
    '''get user selected place to save images'''
    global FOLDER_SELECTED #I know this is bad but had no choice
    FOLDER_SELECTED = filedialog.askdirectory()
    SDOF.delete(0, END)
    SDOF.insert(0, FOLDER_SELECTED)
    return FOLDER_SELECTED

#main

#create window
ROOT.title("RIG V0.25 Steve Shambles 2018")
ROOT.geometry("369x180")
ROOT.resizable(False, False)

#drop down menu
MENU_BAR = Menu(ROOT)
FILE_MENU = Menu(MENU_BAR, tearoff=0)
FILE_MENU.add_command(label="About", command=about_menu)
FILE_MENU.add_command(label="Visit Blog", command=visit_blog)
FILE_MENU.add_separator()
MENU_BAR.add_cascade(label="File", menu=FILE_MENU)
FILE_MENU.add_command(label="Exit", command=ROOT.destroy)
ROOT.config(menu=MENU_BAR)

#create frames

#FRAME1 labelframe for selection of subreddit, limit, topic
FRAME1 = LabelFrame(ROOT, relief=RIDGE)
FRAME1.grid(padx=5, pady=8)

#FRAME0 labelframe for entering api details
FRAME0 = LabelFrame(ROOT, relief=RIDGE)
FRAME0.grid(row=0, column=1, padx=5, pady=8)

#api client id entry box
E1 = Entry(FRAME0, bd=3)
E1.grid(sticky=W+E, padx=5, pady=6)
E1.delete(0, END)
E1.insert(0, "Enter Client ID")
E1.focus() #set the cursor in this box
E1.bind("<Button-1>", onleftclick_cid)

#api client secret entry box
E2 = Entry(FRAME0, bd=3)
E2.grid(sticky=W+E, padx=5, pady=6)
E2.delete(0, END)
E2.insert(0, "Enter Client secret")
E2.focus() #set the cursor in this box
E2.bind("<Button-1>", onleftclick_secret)

#select drive or folder entry box
SDOF = Entry(FRAME0, bd=3)
SDOF.grid(sticky=W+E, padx=5, pady=5)
SDOF.delete(0, END)
SDOF.insert(0, "Save destination")
SDOF.bind("<Button-1>", onleftclick_getdest)

#select subreddit
COMBO1 = Combobox(FRAME1)
COMBO1['values'] = ("Subreddit", "Art","Cats", "celebrity", "Cringepics", \
"DogPictures", "EarthPorn", "ExpectationVsReality", "Funny", "FunnyAndSad", \
"HistoryPorn", "Images", "ImageComics", "Instagram", "Itookapicture", \
"Lol", "Memes", "Naturepics", "NoContextPics", "Pareidolia", \
"PerfectTiming", "Photographs", "Photoshopbattles", "Pics",  \
"PrettyGirls", "TheWayWeWere", "Tumblr", "Wallpapers", "Woahdude")

COMBO1.current(0) #set the selected item
COMBO1.grid(sticky=W+E, padx=5, pady=5)

#choose amount of images to download
COMBO2 = Combobox(FRAME1)
COMBO2['values'] = ("no. of Images to grab", 1, 5, 10, 25, 50, 100, 250, 500, 999)
COMBO2.current(0) #set the selected item
COMBO2.grid(sticky=W+E, padx=5, pady=5)

#select topic combobox
COMBO3 = Combobox(FRAME1)
COMBO3['values'] = ("new", "hot", "top", "rising", "gilded")
COMBO3.current(0) #set the selected item
COMBO3.grid(sticky=W+E, padx=5, pady=5)

#select download pause combobox
COMBO4 = Combobox(FRAME1)
COMBO4['values'] = ("Pauses in seconds", 1, 2, 3, 4, 5)
COMBO4.current(0) #set the selected item
COMBO4.grid(sticky=W+E, padx=5, pady=5)

#grab images button
B = Button(FRAME0, text="Grab Images", command=get_images)
B.grid(sticky=W+E, padx=5, pady=5)

ROOT.mainloop()

#a full list of over 1 million subreddits here:
#https://drive.google.com/file/d/1MckgVte16DzxfXz7b0LptjHAaocyx9DO/view

Be seeing you soon.
Have a good weekend, Steve.

I’m using Python V3.5.6 on Windows 7 desktop.

Next project idea

Whilst nosing around Python forums etc. I came across an interesting bit of code that scrapes certain data from Craigslist.

craigslist

I like the simplicity of it, and I’m considering using it as a starting point for my next project.

'''This will give you URLs on craigslist that accept crypto as a payment'''
import requests
import bs4

pg = 0   #result offset, the loop steps through it 120 listings at a time
num = 0  #running total of listings found
while pg < 3000:
    url = requests.get("https://sfbay.craigslist.org/search/sss?s={page}"
                       "&crypto_currency_ok=1".format(page=pg))
    soup = bs4.BeautifulSoup(url.text, "html.parser")
    products = soup.find_all("li", attrs={"class": "result-row"})
    for i in products:
        new_url = i.find("a")
        print(new_url.get("href"))
        num += 1
    pg += 120
print("\n\nThere were total of {total} listings that accept crypto"
      .format(total=num))

Doesn't look too bad, huh!

I'm thinking about changing it to scrape a different site (or sites) for interesting data and display it all in an easy-to-use GUI, of course. Only programmers don't mind CLI-based utilities; the general public are terrified of them LOL.

I'm thinking maybe scrape football results, league tables or player data. I'm not sure yet, but I have a strong interest in football (soccer): I am a big Tottenham Hotspur follower and have been watching them for most of the last 35 years.

I’m a bit late to the show though, as we just had the World Cup and a new Premier League season started yesterday. I think 4 years should be long enough to knock up a scraper though, don’t you?

COYS

Steve.

 

Reddit Image Grabber update

Yet another Reddit Image Grabber update 🙂

After 4 hours sleep, I got back in the saddle, as I really wanted to finish my Reddit Image Grabber app as quickly as possible.

Everything was moving swiftly on.  I changed the GUI layout, as I mentioned in my last update and started binding the values from the GUI to the working code.

I did it all with no hiccups, until I hit what I call the “category” option.

It just wouldn’t take a variable of any kind.  I was told it is because it is a method of the Reddit API.

So, after a couple of hours of super fast coding, it all came to a sudden standstill at the last hurdle, damn it, I was pretty peeved I can tell you.

I did the usual reading of the docs. I tried the PRAW docs and the Reddit API docs, but unfortunately there was nothing in them that made much sense to a dimwitted beginner such as myself.

After I had wasted the whole afternoon on this one little problem, I gave in to asking for help on (ironically) Reddit. Check out the post.

I won’t bother repeating it all here, you can read the post if you are interested.

In the end (after a little afternoon snooze) I got an answer that partly worked.  There are little glitches that are too boring to mention here, so I decided to disable that feature for now and get on with the rest of the coding.

It’s frustrating, but weighing time against results led me to this conclusion, for now. Rest assured I’ll come back to that feature soon enough.
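
For anyone hitting the same wall: `.new`, `.hot` and friends are methods on the subreddit object, so a string from a combobox cannot simply be dropped in after the dot. A minimal sketch of one common workaround, looking the method up by name with Python's built-in `getattr`, is below. `FakeSubreddit`, `ALLOWED` and `get_listing` are made-up names for illustration so the snippet runs without a Reddit account. With PRAW the idea would be to pass in `REDDIT.subreddit(name)` instead, and this has not been tested against the live API here.

```python
#categories the sketch will accept, anything else is rejected
ALLOWED = ("new", "hot", "top", "rising", "controversial")

class FakeSubreddit:
    '''stand-in for a praw subreddit object, for illustration only'''
    def new(self, limit=10):
        return ["new-{}".format(n) for n in range(limit)]
    def hot(self, limit=10):
        return ["hot-{}".format(n) for n in range(limit)]

def get_listing(subreddit, category, limit):
    '''call subreddit.<category>(limit=limit) using a string name'''
    name = category.strip().lower()
    if name not in ALLOWED:
        raise ValueError("unknown category: {}".format(category))
    method = getattr(subreddit, name) #same as subreddit.new, subreddit.hot...
    return method(limit=limit)

print(get_listing(FakeSubreddit(), "Hot", 3))
```

The whitelist is there so that a typo in the box gives a clear error rather than calling some unrelated attribute.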

So, here is what the new GUI looks like now.

Rig21 gui-startup

Yup, I forgot to plan for a save destination for the user’s downloaded images. Luckily, as I was able to dispense with the “user name” and “password” entry boxes, it all fitted in nicely once I moved the “Grab Images” button to the right frame.

Rig21 gui-in use

In the first drop down box I have added about 15 busy subreddits that are full of images for the user to choose from. I will add a lot more before we are finished. Also, the user can type his/her own preferred subreddits in manually of course.

The second drop down box is “Number of images to grab”.

The Reddit API says a max of 1000 a day, with a 2 second pause in between each downloaded image, so I have limited it to 999. Of course, the user could just type in their own amount or restart to do another 999, but let’s think about these sorts of details once the code is all working properly.
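
Since the user can type anything into those boxes, the typed value needs tidying before it reaches the download loop. The helper below is a small sketch of that idea, not code from RIG: `clamp_choice` is a made-up name, and the 999 and 2 second figures are simply the limits quoted above.

```python
def clamp_choice(text, default, lowest, highest):
    '''turn combobox text into an int between lowest and highest'''
    try:
        value = int(str(text).strip())
    except ValueError:
        #placeholder text such as "no. of Images to grab" ends up here
        return default
    return max(lowest, min(value, highest))

#hypothetical usage with the limits mentioned in this post
images_tograb = clamp_choice("5000", default=10, lowest=1, highest=999)
pause_length = clamp_choice("Pauses in seconds", default=2, lowest=2, highest=5)
print(images_tograb, pause_length)
```

Falling back to a default also stops `int()` from crashing when a box still shows its placeholder text.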

I have also slotted in my standard file menu, and that all works.

If you are going to run the code below then remember it is a bit fragile at the moment, i.e. there are virtually no error checks in place as yet.

Here is the currently working source code.

'''RIG V0.21 Download images from a subreddit'''

from tkinter import Tk, Button, E, END, Entry, filedialog, \
LabelFrame, Menu, messagebox, RIDGE, W
from tkinter.ttk import Combobox

import os
import urllib.request as web
import shutil
from time import sleep
import webbrowser
import pyautogui
import praw

ROOT = Tk()

#pylint rated +9.80
#(2 global vars used,FOLDER_SELECTED and REDDIT)

#functions
def get_images():
    '''connect to reddits api'''
    global REDDIT

    #get client id from entry box
    red_clid = E1.get()
    red_secret = E2.get()
    selected_subred = COMBO1.get()
    images_tograb = COMBO2.get()
    pause_length = COMBO4.get()

    #NOTICE the capital "R" in 'praw.Reddit',
    #I didn't for ages, and this wouldn't work
    REDDIT = praw.Reddit(client_id=str(red_clid),
                         client_secret=str(red_secret),
                         user_agent='RIG V0.21')

    #download images from the subreddit
    #you can change '.new' to one of .hot .rising
    #.gilded or .controversial and others
    #I haven't been able to get a variable to work
    #properly yet in the following line below.
    sub_reddit = REDDIT.subreddit(str(selected_subred)).new(limit=int(images_tograb))

    #get users dest folder
    dir_path = (SDOF.get())
    print(dir_path)

    for submissions in sub_reddit:
        #ignore stickies
        if not submissions.stickied:
            fullfilename = os.path.join(dir_path, "{}.jpg".format(submissions))
            request = web.Request(submissions.url)
            #note the linesplit
            with web.urlopen(request) as response, \
            open(fullfilename, 'wb') as out_file:
                shutil.copyfileobj(response, out_file)
                dir_count = len(os.listdir(dir_path))
                print(fullfilename)
                #note the line split
                print("Downloaded. {} file(s) saved in '{}'." \
                .format(dir_count, dir_path))

                #prob good idea if u don't want to get banned
                sleep(int(pause_length))

def about_menu():
    '''display about message box'''
    messagebox.showinfo("About", "RIG V0.21. Freeware. By Steve Shambles 2018" \
    "\nRIG allows you to download images in bulk from Reddit.com")

def visit_blog():
    '''visit my blog'''
    webbrowser.open("https://stevepython.wordpress.com/")

def onrightclick_cid(event):
    '''on right mouse click clear client id box and paste clipboard in it '''
    E1.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onrightclick_secret(event):
    '''on right click clear client secret box and paste clipboard in it '''
    E2.delete(0, END)
    pyautogui.hotkey('ctrl', 'v')

def onleftclick_getdest(event):
    '''get user selected place to save images'''
    global FOLDER_SELECTED #I know this is bad but had no choice
    FOLDER_SELECTED = filedialog.askdirectory()
    SDOF.delete(0, END)
    SDOF.insert(0, FOLDER_SELECTED)
    return FOLDER_SELECTED

#main

#create window
ROOT.title("RIG V0.21 Steve Shambles 2018")
ROOT.geometry("369x180")
ROOT.resizable(False, False)

#drop down menu
MENU_BAR = Menu(ROOT)
FILE_MENU = Menu(MENU_BAR, tearoff=0)
FILE_MENU.add_command(label="About", command=about_menu)
FILE_MENU.add_command(label="Visit Blog", command=visit_blog)
FILE_MENU.add_separator()
MENU_BAR.add_cascade(label="File", menu=FILE_MENU)
FILE_MENU.add_command(label="Exit", command=ROOT.destroy)
ROOT.config(menu=MENU_BAR)

#create frames

#FRAME1 labelframe for selection of subreddit, limit, topic
FRAME1 = LabelFrame(ROOT, relief=RIDGE)
FRAME1.grid(padx=5, pady=8)

#FRAME0 labelframe for entering api details
FRAME0 = LabelFrame(ROOT, relief=RIDGE)
FRAME0.grid(row=0, column=1, padx=5, pady=8)

#api client id entry box
E1 = Entry(FRAME0, bd=3)
E1.grid(sticky=W+E, padx=5, pady=6)
E1.delete(0, END)
E1.insert(0, "Enter Client ID")
E1.focus() #set the cursor in this box
E1.bind("<Button-3>", onrightclick_cid)

#api client secret entry box
E2 = Entry(FRAME0, bd=3)
E2.grid(sticky=W+E, padx=5, pady=6)
E2.delete(0, END)
E2.insert(0, "Enter Client secret")
E2.focus() #set the cursor in this box
E2.bind("<Button-3>", onrightclick_secret)

#select drive or folder entry box
SDOF = Entry(FRAME0, bd=3)
SDOF.grid(sticky=W+E, padx=5, pady=5)
SDOF.delete(0, END)
SDOF.insert(0, "Save destination")
SDOF.bind("<Button-1>", onleftclick_getdest)

#select subreddit
COMBO1 = Combobox(FRAME1)
COMBO1['values'] = ("Subreddit", "Art", "Cringepics", "EarthPorn", "Gifs",  \
"HistoryPorn", "Images", "Itookapicture", "Memes", "Naturepics",  \
"photographs", "Photoshopbattles", "Pics", "PrettyGirls", "Reactiongifs",  \
"Wallpapers", "Woahdude")
COMBO1.current(0) #set the selected item
COMBO1.grid(sticky=W+E, padx=5, pady=5)

#choose amount of images to download
COMBO2 = Combobox(FRAME1)
COMBO2['values'] = ("no. of Images to grab", 1, 5, 10, 25, 50, 100, 250, 500, 999)
COMBO2.current(0) #set the selected item
COMBO2.grid(sticky=W+E, padx=5, pady=5)

#select topic combobox
COMBO3 = Combobox(FRAME1)
COMBO3['values'] = ("New",) #trailing comma makes this a one item tuple
COMBO3.current(0) #set the selected item
COMBO3.grid(sticky=W+E, padx=5, pady=5)

#select download pause combobox
COMBO4 = Combobox(FRAME1)
COMBO4['values'] = ("Pauses in seconds", 1, 2, 3, 4, 5)
COMBO4.current(0) #set the selected item
COMBO4.grid(sticky=W+E, padx=5, pady=5)

#grab images button
B = Button(FRAME0, text="Grab Images", command=get_images)
B.grid(sticky=W+E, padx=5, pady=5)

ROOT.mainloop()
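
One weak spot in the listing above is that every download is saved as `.jpg`, whatever the link really points to. A rough sketch of checking the first few bytes of the data instead is shown here. `guess_extension` is a made-up helper, not part of RIG, and it only knows a handful of well-known file signatures, so treat it as a starting point.

```python
def guess_extension(data):
    '''guess a file extension from the first bytes of a download'''
    if data.startswith(b"\xff\xd8\xff"):
        return ".jpg"
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return ".png"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return ".gif"
    #web pages usually open with a doctype or an html tag
    head = data[:256].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return ".html"
    return ".bin" #unknown, keep the data but don't pretend it's a jpg

print(guess_extension(b"\xff\xd8\xff\xe0rest of file"))
print(guess_extension(b"  <!DOCTYPE html><html>"))
```

In `get_images` this would mean reading the response into memory first, e.g. `data = response.read()`, and then building the file name from `guess_extension(data)` before writing it out.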

I have just retested this code for the 100th time today and can you guess what? I got banned LOL.

I suppose I must have downloaded a crap load of images during tests, maybe over the 1000 limit?  I’ll have to reset the API and get new client-id etc.

Well guys and gals, I have worked my fat ass off on this today, and I need a little leisure time. My poor old back is killing me, hunched over this hot keyboard for gawd knows how long.

But I’ll be back on it tomorrow morning, and hopefully I will have a worthwhile daily update on what I have done.

Be seeing you, Steve.

RIG-First update

I have spent most of today trying to make a nice tabbed GUI for RIG.

I gave up on the tabs in the end. It should have been easy, but I felt the day ebbing away without much progress, so I decided to go with the easier flat GUI that I already have much of the code written for.

So, here is the first look at the draft version of the GUI.

rig gui v1

It’s decent enough for a first attempt, you can see why I wanted tabs though can’t you!

Though the “engine” part of the code works, as you saw in my first RIG post, I haven’t linked the GUI to the code yet. That’s more than one day’s work for sure.

I’m already pondering the help file that will need to go with it. In my CBMan project I lazily just used a text file that opened in the default viewer. For RIG, I want the instructions displayed with an in-built method. I have no idea how that will work yet, but that’s my aim.

I have made quite a few changes to the original code. I have removed some unneeded lines and split the code into two functions, although that might have been a mistake and I may just have it all in one function in the end. It has already caused me to use a global variable, which the experts just totally hate 🙂 so I do try to avoid them.

A feature I totally forgot to mention in my other RIG post was that you can select the “post type”, for want of a better phrase.

I have no idea what they call the sub-subreddits, but they include NEW, TOP, RISING, HOT, CONTROVERSIAL, GILDED and WIKI. I have added an option to choose which one you want to grab from.

For example if you have already downloaded all the images in a subreddit, you could use “NEW” to grab the new posts.

I am also going to try to find a list of good subreddits that contain images to include in the subreddit drop-down list, but the user can still type in any subreddit they want of course.

There’s no point in posting the code today as it’s mostly GUI stuff that you have seen before, so I will wait until a usable version is ready before posting it, probably in 2 or 3 days, if I don’t get stuck.

It’s been a long day, so I’ll catch you later, Steve.

Using Python V3.5.6 On Windows 7.

Update to the update!

After finishing this post I was going to bed (8am) when I couldn’t resist a last quick look around Reddit. On the “Learn Python” subreddit a post caught my eye: “Need help with using PRAW”.

The guy there was stuck and I thought I had the answer: he couldn’t connect with the Reddit API because he had left out his user name and password.

It did hit me as a bit odd to leave those out, but I always want to help others if I can, even though I realise as a beginner myself I could easily embarrass myself, and yes of course I did embarrass myself….

But the upside was that he taught me something valuable: if you are only reading, i.e. downloading, from the Reddit API then you do not need to send a user name and password. They are only required for writing to the API. Doh!

So although I was wrong, I gained some useful knowledge that nowhere else that I had read about the Reddit API had even bothered to mention, and now I have two fewer labels to code in my GUI 🙂

Actually, three fewer labels, as I might chance it and use a default User Agent as well.
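
To make the difference concrete, here is a small sketch of which settings each mode needs. `reddit_settings` is a made-up helper for illustration. The keyword names match the arguments that `praw.Reddit` accepts, and the placeholder strings are not real credentials.

```python
def reddit_settings(client_id, client_secret, user_agent,
                    username=None, password=None):
    '''build keyword arguments for praw.Reddit'''
    settings = {"client_id": client_id,
                "client_secret": client_secret,
                "user_agent": user_agent}
    #only add login details when both are supplied (needed for posting etc.)
    if username and password:
        settings["username"] = username
        settings["password"] = password
    return settings

read_only = reddit_settings("my-client-id", "my-secret", "RIG sketch")
print(sorted(read_only))
#with praw installed: REDDIT = praw.Reddit(**read_only)
```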

The guy was still stuck with his problem, so I posted a working stand-alone non-gui version on pastebin for him and I hope he can work it out from there.

I tried his client ID and secret key in my working code and got the same error as he did, so his keys are wrong. Maybe the API banned his keys or he mis-pasted them, but I did eventually give him the correct answer, hurrah.

Live and learn.