How to save a perfectly-scraped webpage into DEVONthink using IFTTT, Diffbot, Hazel, & several command line tools

DEVONthink is a key piece of software for me on my Mac. In particular, I use it to store copies of webpages that I run across that I want students to read or that I want to refer back to for teaching, or for writing, or for my own use.

Now, it’s very easy to get webpages into DEVONthink by using the browser extensions that come with the software. You click on the extension, & you get a small window with a Format menu. Click that menu, & you get several choices. This is great, as is the checkbox for Instapaper, which runs the webpage through that awesome service & gives you results with just the featured content & none of the crap.

But even with Instapaper, these results are not perfect, at least for me. Here’s my problem: I want a webpage so that I can see images & hyperlinks & other stuff that only comes with the Web. I like PDFs, but not when I can just have good ol’ HTML to deal with. But if I choose the HTML Page or Web Archive options, then I get a bunch of junk I don’t want, like ads & extraneous content. If I check the box next to Instapaper, I get less junk, but I lose a lot of control over what gets selected & what doesn’t get selected, & the original URL of the webpage, along with a lot of other important metadata, gets stripped away by Instapaper.

What I want is something neat & clean: the title of the Web article at the top as an H1, then the author, date of publication, & URL below, all H2s in the HTML, & finally the content & nothing else. Yes, I know this is picky, but it’s what I really want.

So I set out to create it over several months, & I finally got it all figured out & set up & working this summer. After testing it for months to verify that it works well, I am now ready to unveil this process to you, the readers of Chainsaw on a Tire Swing.

Before I dig in to the details, let me give you the 20,000-foot summary of the process. It might seem complicated, & I guess it kinda is, but it’s not that bad if you go through it step by step, & it does work beautifully. I’m going to mention several services in this introduction that you might not have heard of. Don’t worry, I’ll explain everything below.

1. Send an email to the IFTTT (If This Then That) service with the URL of the webpage at the top of the message.
2. IFTTT saves the email as a file in a specific folder in your Dropbox.
3. Hazel on your Mac notices the new file in the folder & runs a shell script.
4. The shell script grabs the URL out of the file & sends a request to the Diffbot service, saving the result to the /tmp directory as a webpage.
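
To make the shell script step a little more concrete, here’s a minimal sketch of how a script might pull the first URL out of a saved email file & hand it to Diffbot. This is not the actual script from this process: the function name `first_url`, the `DIFFBOT_TOKEN` variable, the sample file, & the output path `/tmp/diffbot-result` are all placeholders I made up for illustration, & the endpoint assumes Diffbot’s v3 Article API, so check the current Diffbot documentation before relying on it.

```shell
#!/bin/sh
# Sketch only: names, paths, & the endpoint below are assumptions, not the real script.

# Print the first http(s) URL found in the file passed as $1.
first_url() {
  grep -Eo 'https?://[^[:space:]<>"]+' "$1" | head -n 1
}

# Stand-in for the file IFTTT would drop into Dropbox: URL at the top of the message.
sample=$(mktemp)
printf '%s\n' 'https://example.com/some/article' '' 'Sent from my phone' > "$sample"

url=$(first_url "$sample")
echo "$url"

# Only contact Diffbot when a token is set, so the sketch is safe to run without one.
if [ -n "${DIFFBOT_TOKEN:-}" ]; then
  # -G sends the url-encoded pairs as a query string on a GET request.
  curl -sS -G 'https://api.diffbot.com/v3/article' \
    --data-urlencode "token=$DIFFBOT_TOKEN" \
    --data-urlencode "url=$url" \
    -o /tmp/diffbot-result
fi

rm -f "$sample"
```

In a Hazel rule, the file that triggered the rule would take the place of the sample file here. Turning whatever Diffbot returns into the final webpage is a separate step that this sketch leaves out.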