PUZZLE/PROBLEM: Save content of an opened tab in Chrome, site no longer exists.

OK so … I haven’t posted here in a while. When I came across this problem I knew it would find its solution on OT.

So here it is:

A while back I opened some tabs in Google Chrome. (OSX ML)
…Since then (40+ days or so) the website on which I opened the tabs has gone down. I would like to retain the information, as these tabs are possibly the last known archives of some of these pages. Even the Web Archives has removed the site’s history from its records.

I have tried the standard SAVE AS (Complete, Website), etc. This does not work, because it does not save the cached version or the one that is open, but rather requests the page and assets again from the server. So when I do this, the files are blank. All assets are dead links. Womp womp. Any thoughts? I was thinking screenshots, but this would be tedious and wouldn’t save the images at full resolution, etc. I tried saving via the Web Inspector and can get the HTML, but what good is that without the images?

Thoughts… ?

I wish it was that easy… most of what I want is the images.

Oh, I just realized you said you’re on OSX.

Use this method instead. It’s much more tedious, but it will get the job done.

1. Open up

2. Click on a file you want to download.

3. Press Ctrl+Shift+J (Cmd+Option+J on OS X) to open the JavaScript console.

4. Copy/paste this into the console.

!function(){var preTags=document.getElementsByTagName("pre");var preWithHeaderInfo=preTags[0];var preWithContent=preTags[2];var lines=preWithContent.textContent.split("\n");var rgx=/^(0{8}:\s+)([0-9a-f]{2}\s+)[0-9a-f]{2}/m;var match=rgx.exec(lines[0]);var text="";for(var i=0;i<lines.length;i++){var line=lines[i];var firstIndex=match[1].length;var indexJump=match[2].length;var totalCharsPerLine=16;var index=firstIndex;for(var j=0;j<totalCharsPerLine;j++){var hexValAsStr=line.substr(index,2);if(hexValAsStr=="  "){break}var asciiVal=parseInt(hexValAsStr,16);text+=String.fromCharCode(asciiVal);index+=indexJump}}var headerText=preWithHeaderInfo.textContent;var elToInsertBefore=document.body.childNodes[0];var insertedDiv=document.createElement("div");document.body.insertBefore(insertedDiv,elToInsertBefore);var nodes=[document.body];var filepath="";while(true){var node=nodes.pop();if(node.hasChildNodes()){var children=node.childNodes;for(var i=children.length-1;i>=0;i--){nodes.push(children[i])}}if(node.nodeType===Node.TEXT_NODE&&/\S/.test(node.nodeValue)){filepath=node.nodeValue;break}}outputResults(insertedDiv,convertToBase64(text),filepath,headerText);insertedDiv.appendChild(document.createElement("hr"));function outputResults(parentElement,fileContents,fileUrl,headerText){var rgx=/.+\/([^\/]+)/;var filename=rgx.exec(fileUrl)[1];rgx=/content-type: (.+)/i;var match=rgx.exec(headerText);var contentTypeFound=match!=null;var contentType="text/plain";if(contentTypeFound){contentType=match[1]}var dataUri="data:"+contentType+";base64,"+fileContents;var gZipRgx=/content-encoding: gzip/i;if(gZipRgx.test(headerText)){filename+=".gz"}var imageRgx=/image/i;var isImage=imageRgx.test(contentType);var aTag=document.createElement("a");aTag.textContent="Left-click to download the cached file";aTag.setAttribute("href",dataUri);aTag.setAttribute("download",filename);parentElement.appendChild(aTag);parentElement.appendChild(document.createElement("br"));if(isImage){var imgTag=document.createElement("img");imgTag.setAttribute("src",dataUri);parentElement.appendChild(imgTag);parentElement.appendChild(document.createElement("br"))}if(!contentTypeFound){var pTag=document.createElement("p");pTag.textContent="WARNING: the type of file was not found in the headers... defaulting to text file.";parentElement.appendChild(pTag)}}function getBase64Char(base64Value){if(base64Value<0){throw"Invalid number: "+base64Value}else if(base64Value<=25){return String.fromCharCode(base64Value+"A".charCodeAt(0))}else if(base64Value<=51){base64Value-=26;return String.fromCharCode(base64Value+"a".charCodeAt(0))}else if(base64Value<=61){base64Value-=52;return String.fromCharCode(base64Value+"0".charCodeAt(0))}else if(base64Value<=62){return"+"}else if(base64Value<=63){return"/"}else{throw"Invalid number: "+base64Value}}function convertToBase64(input){var remainingBits;var result="";var additionalCharsNeeded=0;var charIndex=-1;var charAsciiValue;var advanceToNextChar=function(){charIndex++;charAsciiValue=input.charCodeAt(charIndex);return charIndex<input.length};while(true){var base64Char;if(!advanceToNextChar())break;base64Char=charAsciiValue>>>2;remainingBits=charAsciiValue&3;result+=getBase64Char(base64Char);additionalCharsNeeded=3;if(!advanceToNextChar())break;base64Char=remainingBits<<4|charAsciiValue>>>4;remainingBits=charAsciiValue&15;result+=getBase64Char(base64Char);additionalCharsNeeded=2;if(!advanceToNextChar())break;base64Char=remainingBits<<2|charAsciiValue>>>6;result+=getBase64Char(base64Char);remainingBits=charAsciiValue&63;result+=getBase64Char(remainingBits);additionalCharsNeeded=0}if(additionalCharsNeeded==2){remainingBits=remainingBits<<2;result+=getBase64Char(remainingBits)+"="}else if(additionalCharsNeeded==3){remainingBits=remainingBits<<4;result+=getBase64Char(remainingBits)+"=="}else if(additionalCharsNeeded!=0){throw"Unhandled number of additional chars needed: "+additionalCharsNeeded}return result}}();

5. Download your file, rinse and repeat for each file.

source:
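
For anyone wondering what the one-liner in step 4 is doing, here is a more readable sketch of its core idea: turn the hex dump text into bytes, then into a base64 data URI. This is not the original script. The function names (`hexDumpToBytes`, `toDataUri`) are made up for illustration, and it assumes each dump line is an 8-digit offset, a colon, then up to 16 two-digit hex values separated by single spaces, followed by an ASCII column. It also leans on the built-in `btoa` rather than a hand-rolled base64 encoder.

```javascript
// Sketch only: decode hex dump text into raw bytes.
// Assumed line layout (an assumption, not checked against every Chrome version):
//   "00000000: 89 50 4e 47 ...  .PNG..."
function hexDumpToBytes(dumpText) {
  const bytes = [];
  // Offset, colon, then 1 to 16 hex pairs separated by single spaces.
  const lineRgx = /^[0-9a-f]{8}:\s+((?:[0-9a-f]{2} ){0,15}[0-9a-f]{2})/i;
  for (const line of dumpText.split("\n")) {
    const match = lineRgx.exec(line);
    if (!match) continue; // skip blank or non-dump lines
    for (const hex of match[1].split(" ")) {
      bytes.push(parseInt(hex, 16));
    }
  }
  return Uint8Array.from(bytes);
}

// Build a data: URI from the bytes and a content type such as "image/png".
function toDataUri(bytes, contentType) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return "data:" + contentType + ";base64," + btoa(binary);
}
```

In the console you would feed it `document.getElementsByTagName("pre")[2].textContent`, the same element the original script reads, and put the resulting URI in the `href` of a link with a `download` attribute.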

Awesome attempt my friend! Valiant effort!

Unfortunately, the combination of the age of the tabs in question and the fact that they are Incognito tabs makes Step 2 an impossible feat. The URLs for these tabs are not in the chrome://cache list :/

Ah -The Struggle.

What happens if you drag&drop each image? Does it try to retrieve a new copy?

Actually, never mind.

Go to chrome://cache in a new tab in the same incognito window.

I just tested it. It works.

Does not work for me. Either I have opened too many tabs or it has been too long since then: my chrome://cache doesn’t even have all of the URLs that I was at yesterday, let alone 40+ days ago.

To answer your 1st question: yes, dragging and dropping them to the desktop DOES work!!! I had not tried that! This makes me wonder: if I D/L the HTML and then replace the missing imgs with the dragged images, this might work. Will keep ya posted.
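
If it helps, the "replace the missing imgs" part could be scripted instead of done by hand. Below is a rough sketch, assuming the dragged images end up in one folder and keep the file name from the end of their original URL (worth checking, since that is not guaranteed). `localFileName` and `localizeImageSources` are made-up names, and the regex approach is crude: it only looks at the `src` attribute of `<img>` tags.

```javascript
// Sketch only: point <img> tags in saved HTML at local copies of the images.

// "http://example.com/a/photo.jpg?v=2" -> "photo.jpg"
function localFileName(url) {
  const path = url.split(/[?#]/)[0]; // drop query string and fragment
  return path.substring(path.lastIndexOf("/") + 1);
}

// Rewrite every <img src="..."> so it points into the given local folder.
function localizeImageSources(html, folder) {
  const imgSrcRgx = /(<img\b[^>]*?\ssrc\s*=\s*)(["'])(.*?)\2/gi;
  return html.replace(imgSrcRgx, (whole, prefix, quote, url) => {
    const name = localFileName(url);
    // Leave the tag alone if no file name can be derived from the URL.
    return name ? prefix + quote + folder + "/" + name + quote : whole;
  });
}
```

You would paste the HTML copied from the Web Inspector in as a string, call `localizeImageSources(html, "images")`, and save the result next to an `images` folder holding the dragged files.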

Try reading the first post, brah

OP states this does not work, as the site was removed from the archive as well… these tabs I have open may be the last remaining archives.

Copy & paste into Word, then save as HTML.

Might get you close… then you could use a screen cap to clean up the HTML.