This blog is all about Cyber Security and IT

Wednesday, February 13, 2019

How Python helped me to automate my Cyber Stuff



I am writing this post after completing three weeks of learning Python, and guys, I am seriously impressed with this language.





You know, the best part is that this language lets you talk to almost any API and has lots of modules, like:





  • webbrowser. Comes with Python and opens a browser to a specific page.
  • Requests. Downloads files and web pages from the Internet.
  • Beautiful Soup. Parses HTML, the format that web pages are written in.
  • Selenium. Launches and controls a web browser. Selenium is able to fill in forms and simulate mouse clicks in this browser.




The webbrowser Module





The webbrowser module’s open() function can launch a new browser to a specified URL. Enter the following into the interactive shell:





>>> import webbrowser
>>> webbrowser.open('http://cyberknowledgebase.com/')
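Since webbrowser.open() just needs a URL string, you can combine it with input() to jump straight to a lookup page. Here's a minimal sketch, assuming the X-Force search URL pattern used later in this post:

import webbrowser

# Ask for an IP and open its reputation page in the default browser.
# The URL pattern is assumed from the X-Force example later in this post.
ip = input('Enter the IP to look up\n')
webbrowser.open('https://exchange.xforce.ibmcloud.com/search/' + ip)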




Downloading Files from the Web with the requests Module





The requests module lets you easily download files from the Web without having to worry about complicated issues such as network errors, connection problems, and data compression. The requests module doesn’t come with Python, so you’ll have to install it first. From the command line, run pip install requests.





>>> import requests




If no error messages show up, then the requests module has been successfully installed.





Downloading a Web Page with the requests.get() Function





The requests.get() function takes a string of a URL to download. By calling type() on requests.get()’s return value, you can see that it returns a Response object, which contains the response that the web server gave for your request. I’ll explain the Response object in more detail later, but for now, enter the following into the interactive shell while your computer is connected to the Internet:





>>> import requests
>>> res = requests.get('http://cyberknowledgebase.com/')
>>> type(res)
<class 'requests.models.Response'>
>>> res.status_code == requests.codes.ok
True




Checking for Errors





As you’ve seen, the Response object has a status_code attribute that can be checked against requests.codes.ok to see whether the download succeeded. A simpler way to check for success is to call the raise_for_status() method on the Response object. This will raise an exception if there was an error downloading the file and will do nothing if the download succeeded. Enter the following into the interactive shell:





>>> res = requests.get('http://cyberknowledgebase.com/page_that_does_not_exist')
>>> res.raise_for_status()
Traceback (most recent call last):
  File "<pyshell#138>", line 1, in <module>
    res.raise_for_status()
  File "C:\Python34\lib\site-packages\requests\models.py", line 773, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 404 Client Error: Not Found




The raise_for_status() method is a good way to ensure that a program halts if a bad download occurs. This is a good thing: You want your program to stop as soon as some unexpected error happens. If a failed download isn’t a deal breaker for your program, you can wrap the raise_for_status() line with try and except statements to handle this error case without crashing.





import requests
res = requests.get('http://inventwithpython.com/page_that_does_not_exist')
try:
    res.raise_for_status()
except Exception as exc:
    print('There was a problem: %s' % (exc))




This raise_for_status() method call causes the program to output the following:





There was a problem: 404 Client Error: Not Found




Always call raise_for_status() after calling requests.get(). You want to be sure that the download has actually worked before your program continues.
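One way to make this habit automatic is to wrap the two calls in a small helper. The fetch() name below is my own, not part of requests:

import requests

def fetch(url):
    # Download a URL and raise requests.exceptions.HTTPError on any 4xx/5xx status.
    res = requests.get(url)
    res.raise_for_status()
    return res

res = fetch('http://cyberknowledgebase.com/')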





Saving Downloaded Files to the Hard Drive





From here, you can save the web page to a file on your hard drive with the standard open() function and write() method. There are some slight differences, though. First, you must open the file in write binary mode by passing the string 'wb' as the second argument to open(). Even if the page is in plaintext, you need to write binary data instead of text data in order to maintain the Unicode encoding of the text.





To write the web page to a file, you can use a for loop with the Response object’s iter_content() method.





>>> import requests
>>> res = requests.get('https://cyberknowledgebase.com/abc.txt')
>>> res.raise_for_status()
>>> playFile = open('abc.txt', 'wb')
>>> for chunk in res.iter_content(100000):
        playFile.write(chunk)

100000
78981
>>> playFile.close()




The iter_content() method returns “chunks” of the content on each iteration through the loop. Each chunk is of the bytes data type, and you get to specify how many bytes each chunk will contain. One hundred thousand bytes is generally a good size, so pass 100000 as the argument to iter_content().





The file abc.txt will now exist in the current working directory. Note that the name you pass to open() is what the file will be called on your hard drive; it doesn't have to match the filename on the website. The requests module simply handles downloading the contents of web pages. Once the page is downloaded, it is simply data in your program. Even if you were to lose your Internet connection after downloading the web page, all the page data would still be on your computer.





The write() method returns the number of bytes written to the file.





To review, here’s the complete process for downloading and saving a file:





  1. Call requests.get() to download the file.
  2. Call open() with 'wb' to create a new file in write binary mode.
  3. Loop over the Response object’s iter_content() method.
  4. Call write() on each iteration to write the content to the file.
  5. Call close() to close the file.
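Put together, the five steps might look like the function below. This is my own sketch, not code from the requests documentation; I use a with statement, which closes the file automatically and is equivalent to calling close() yourself.

import requests

def download_file(url, filename):
    res = requests.get(url)                      # 1. download the file
    res.raise_for_status()
    with open(filename, 'wb') as f:              # 2. open in write binary mode
        for chunk in res.iter_content(100000):   # 3. loop over 100,000-byte chunks
            f.write(chunk)                       # 4. write each chunk to the file
    # 5. the with statement closes the file on exit

download_file('https://cyberknowledgebase.com/abc.txt', 'abc.txt')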




That’s all there is to the requests module! The for loop and iter_content() stuff may seem complicated compared to the open()/write()/close() workflow you’ve been using to write text files, but it’s to ensure that the requests module doesn’t eat up too much memory even if you download massive files.





Example: Downloading X-Force Data





import requests

print('X-Force results')
x = input('Enter the IP\n')
res = requests.get('https://exchange.xforce.ibmcloud.com/search/' + x)
res.raise_for_status()
print(res)

# Save the response to disk in write binary mode, 100,000 bytes at a time.
malware_response = open('xforce_result.html', 'wb')
for malware_data in res.iter_content(100000):
    malware_response.write(malware_data)
malware_response.close()





Parsing HTML with the BeautifulSoup Module





Beautiful Soup is a module for extracting information from an HTML page (and is much better for this purpose than regular expressions). The BeautifulSoup module’s name is bs4 (for Beautiful Soup, version 4). To install it, you will need to run pip install beautifulsoup4 from the command line. To import Beautiful Soup you run import bs4.









Here is a simple HTML file, example.html, that the next examples will use:

<!-- This is the example.html example file. -->

<html><head><title>Cyber Knowledge Base</title></head>
<body>
<p>Download my <strong>Python</strong> book from <a href="http://cyberknowledgebase.com">my website</a>.</p>
<p class="slogan">Learn Python the easy way!</p>
<p>By <span id="author">Al Davinder</span></p>
</body></html>




As you can see, even a simple HTML file involves many different tags and attributes, and matters quickly get confusing with complex websites. Thankfully, Beautiful Soup makes working with HTML much easier.





Creating a BeautifulSoup Object from HTML





The bs4.BeautifulSoup() function needs to be called with a string containing the HTML it will parse. The bs4.BeautifulSoup() function returns a BeautifulSoup object. Enter the following into the interactive shell while your computer is connected to the Internet:





>>> import requests, bs4
>>> res = requests.get('https://cyberknowledgebase.com')
>>> res.raise_for_status()
>>> noStarchSoup = bs4.BeautifulSoup(res.text, 'html.parser')
>>> type(noStarchSoup)
<class 'bs4.BeautifulSoup'>




This code uses requests.get() to download the main page from my website and then passes the text attribute of the response to bs4.BeautifulSoup(). The BeautifulSoup object that it returns is stored in a variable named noStarchSoup.





You can also load an HTML file from your hard drive by passing a File object to bs4.BeautifulSoup(). Enter the following into the interactive shell (make sure the example.html file is in the working directory):





>>> exampleFile = open('example.html')
>>> exampleSoup = bs4.BeautifulSoup(exampleFile, 'html.parser')
>>> type(exampleSoup)
<class 'bs4.BeautifulSoup'>




Once you have a BeautifulSoup object, you can use its methods to locate specific parts of an HTML document.
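For example, the select() method takes a CSS selector and returns a list of matching elements. A quick sketch against the example.html file shown above:

import bs4

exampleFile = open('example.html')
exampleSoup = bs4.BeautifulSoup(exampleFile, 'html.parser')
exampleFile.close()

# select() returns a list of Tag objects matching a CSS selector.
author = exampleSoup.select('#author')[0]           # the <span id="author"> element
print(author.getText())                             # prints: Al Davinder
print(exampleSoup.select('p.slogan')[0].getText())  # prints: Learn Python the easy way!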





The requests and BeautifulSoup modules are great as long as you can figure out the URL you need to pass to requests.get(). However, sometimes this isn’t so easy to find. Or perhaps the website you want your program to navigate requires you to log in first. The selenium module will give your programs the power to perform such sophisticated tasks.





Controlling the Browser with the selenium Module





Importing the modules for Selenium is slightly tricky. Instead of import selenium, you need to run from selenium import webdriver. After that, you can launch the Firefox browser with Selenium. Enter the following into the interactive shell:





>>> from selenium import webdriver
>>> browser = webdriver.Firefox()
>>> type(browser)
<class 'selenium.webdriver.firefox.webdriver.WebDriver'>
>>> browser.get('http://cyberknowledgebase.com')








After calling webdriver.Firefox() and get() in IDLE, the Firefox browser appears.
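From there you can drive the page itself. A small sketch, assuming Selenium 3-style element lookups (newer Selenium versions use find_element(By.LINK_TEXT, ...) instead) and a hypothetical 'Home' link on the page:

from selenium import webdriver

browser = webdriver.Firefox()   # needs Firefox plus the geckodriver executable on PATH
browser.get('http://cyberknowledgebase.com')
try:
    # The link text 'Home' is a made-up example; use whatever is on the page.
    linkElem = browser.find_element_by_link_text('Home')
    linkElem.click()
except Exception as exc:
    print('Was not able to find that element: %s' % (exc))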









Note: For complete information about these modules, search on Google. My motive here is to point you to the things that are most useful for automation.





My First Contribution to the Company





Challenge: My daily work at the company included one boring task: checking the reputation of source IPs on the web. For each alert I had to copy the source IP, check it on the first website, then on the second, and so on.





Automation Done:





To make this task easier, I wrote a script that automatically fetches the relevant data from the mail, such as the IP whose reputation needs to be analyzed. The browser then gets all the details for me with just one click.





To build it, I followed the steps below:





1: Import the modules: win32com.client, sys, os, requests, re, webbrowser. The win32com.client module (part of the pywin32 package) is what lets Python drive Outlook.





2: Get access to Outlook





3: Read mail





4: Extract the IP with a regex





5: Use that IP to search in the browser
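Here is a hedged sketch of those steps in one script. It assumes Outlook on Windows with pywin32 installed; folder index 6 is Outlook's standard Inbox, and the regex and URL pattern come from the X-Force example earlier.

import re, webbrowser
import win32com.client

# 2: Get access to Outlook through its COM interface.
outlook = win32com.client.Dispatch('Outlook.Application').GetNamespace('MAPI')
inbox = outlook.GetDefaultFolder(6)   # 6 = Inbox

# 3: Read the last mail in the Inbox.
body = inbox.Items.GetLast().Body

# 4: Extract IPv4 addresses with a regex.
ips = re.findall(r'\b\d{1,3}(?:\.\d{1,3}){3}\b', body)

# 5: Open each IP's reputation page in the browser.
for ip in ips:
    webbrowser.open('https://exchange.xforce.ibmcloud.com/search/' + ip)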





