 
914World.com - The fastest growing online 914 community!
 
Porsche, and the Porsche crest are registered trademarks of Dr. Ing. h.c. F. Porsche AG. This site is not affiliated with Porsche in any way.
Its only purpose is to provide an online forum for car enthusiasts. All other trademarks are property of their respective owners.
 

> SOT: Programmers and Database Builders, Last bump.... Bummer
McMark
post May 7 2015, 10:45 AM
Post #1


914 Freak!
***************

Group: Retired Admin
Posts: 20,179
Joined: 13-March 03
From: Grand Rapids, MI
Member No.: 419
Region Association: None



I'd like to transcribe the PET file into a usable database. I've tried a few times to undertake this project myself, but it's daunting. So I realized that we could set up a site where our members could add a little bit of the data at a time. With everyone's help, it'll be done in no time. Having this data available will enable future developments, such as adding real pictures of the parts, better how-to threads, linking to part numbers in posts, etc. Andy and I have both planned on setting this up, but neither of us has actually found the time to get started. Since this doesn't really need to be tied into the 914World forum in any way, we don't need to build it on the forum servers. We can set this up independently and then import the completed database file when we're done...

Anyone interested in helping with this project? Here's a bit of overview on what I had planned:

***Split the PET file into JPG/GIF files***
I planned on splitting the file up into usable image files, which could also be stored in the database (its own table?). The tricky part is that besides splitting the PDF by page numbers, we also have to split SOME of the pages in half.
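
A rough sketch of how the "split SOME pages in half" step could be planned, assuming the pages have already been rendered to images by some other tool. The page sizes, the `pages_to_split` mapping and both function names are made up for illustration. The boxes use the (left, upper, right, lower) order that Pillow's `Image.crop` expects, so they could be handed to it directly.

```python
def half_boxes(width, height, split="vertical"):
    """Return two (left, upper, right, lower) crop boxes that cover the page."""
    if split == "vertical":        # left half / right half
        mid = width // 2
        return [(0, 0, mid, height), (mid, 0, width, height)]
    if split == "horizontal":      # top half / bottom half
        mid = height // 2
        return [(0, 0, width, mid), (0, mid, width, height)]
    raise ValueError(f"unknown split direction: {split!r}")


def plan_crops(page_sizes, pages_to_split):
    """Map each page number to the crop boxes that should be exported.

    page_sizes:     {page_no: (width, height)} in pixels (hypothetical input)
    pages_to_split: {page_no: "vertical" | "horizontal"} for the pages that
                    need halving; every other page is exported whole.
    """
    plan = {}
    for page_no, (width, height) in page_sizes.items():
        if page_no in pages_to_split:
            plan[page_no] = half_boxes(width, height, pages_to_split[page_no])
        else:
            plan[page_no] = [(0, 0, width, height)]
    return plan
```

Which pages need splitting, and in which direction, would still have to be listed by hand; the sketch only keeps that list separate from the cropping itself.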

***Phase 1 Data Entry***
This one is simpler: just build a page that displays one of the PET images (not the exploded diagrams, just the parts list) alongside an HTML form that matches the formatting, so a user could log in and transcribe, line by line, as much of the image as they felt like. The form should save the data automatically (AJAX) so the user doesn't have to complete a page, or remember to click save, etc. This means that when a user 'starts work' they could be presented with a partially complete image to add data to. In that case, it would also be useful to add a checkbox at the end of each line to indicate that the previous work has been double-checked and is correct. Once all of the data is entered, new requests for 'work' would be presented with completed images for double-checking. Once a line has been triple-checked, it could be locked as accurate. Eventually we would have all the data transferred and triple-checked.
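
The check-and-lock rule above could be modelled roughly like this. It is only a sketch: the class and function names are invented, and it assumes "triple-checked" means three different users other than the transcriber have ticked the box, which is one possible reading.

```python
from dataclasses import dataclass, field

CHECKS_TO_LOCK = 3  # assumption: "triple-checked" = three distinct confirming users


@dataclass
class TranscribedLine:
    image_id: int
    line_no: int
    text: str = ""
    author: str = ""
    confirmed_by: set = field(default_factory=set)

    @property
    def locked(self):
        return len(self.confirmed_by) >= CHECKS_TO_LOCK

    def save(self, user, text):
        """Autosave an edit; changing the text throws away earlier confirmations."""
        if self.locked:
            raise PermissionError("line is locked as accurate")
        if text != self.text:
            self.text = text
            self.author = user
            self.confirmed_by.clear()

    def confirm(self, user):
        """Tick the 'checked' box; authors cannot confirm their own entry."""
        if self.text and user != self.author and not self.locked:
            self.confirmed_by.add(user)


def next_work(lines):
    """Hand out untranscribed lines first, then the least-checked unlocked line."""
    open_lines = [line for line in lines if not line.locked]
    if not open_lines:
        return None
    return min(open_lines, key=lambda l: (bool(l.text), len(l.confirmed_by)))
```

Resetting the confirmations whenever the text changes is a design choice, so a corrected line has to be checked again from scratch.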

***Phase 2 Real World Descriptions***
Since a lot of the listings in the PET are translated from German incorrectly, it would be worthwhile to go through all the listings again to translate them. This would be a slightly different process from above. We would display an exploded diagram and the details for that image from the database, not from the PET images. The only form field would be an [i]additional[/i] field for a new description. I think it would be useful to maintain an original listing of the description from the PET, as well as our own description. It would also be useful to collect multiple descriptions, which may not be shown publicly, but would be useful for searching for parts. Take the 'Taco plate': it's listed in the PET as 'cover for oil sump' but everyone knows it as a taco plate, and it could also be called an oil temp sender plate. All of these descriptions would be useful for searching.
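
One way the "original description plus our description plus search-only names" idea could be laid out, sketched with Python's built-in `sqlite3`. The table names, column names and the `search` helper are assumptions for illustration, not a settled schema.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a part table and an alias table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE part (
            id              INTEGER PRIMARY KEY,
            part_number     TEXT NOT NULL,
            pet_description TEXT NOT NULL,  -- original PET wording, kept as-is
            our_description TEXT            -- corrected wording shown publicly
        );
        CREATE TABLE part_alias (
            part_id INTEGER NOT NULL REFERENCES part(id),
            alias   TEXT NOT NULL           -- search-only names, not shown publicly
        );
    """)
    return db


def search(db, term):
    """Find parts whose PET text, our text, or any alias contains the term."""
    like = f"%{term}%"
    rows = db.execute(
        """
        SELECT DISTINCT p.part_number, p.pet_description
        FROM part p
        LEFT JOIN part_alias a ON a.part_id = p.id
        WHERE p.pet_description LIKE ?
           OR p.our_description LIKE ?
           OR a.alias LIKE ?
        ORDER BY p.part_number
        """,
        (like, like, like),
    )
    return rows.fetchall()
```

With a part stored as 'cover for oil sump' and the aliases 'taco plate' and 'oil temp sender plate', a search for any of the three names would return the same row. SQLite's `LIKE` ignores case for ASCII letters by default, so 'Taco' and 'taco' both match.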

***Phase 3 Further Expansion***
This phase is probably where the project would end, with the data integrated into the forum software and future development handled by Andy or me. But in order to describe the full process, I've included it here. This phase would be where members could add pictures of the parts (alone or on the car), as well as things like original finishes (paint, plating, etc.), manufacture materials, and possible replacements (e.g. using 911 Sport Mounts instead of Transmission Mounts).


SirAndy
post May 7 2015, 10:55 AM
Post #2


Resident German
*************************

Group: Admin
Posts: 41,626
Joined: 21-January 03
From: Oakland, Kalifornia
Member No.: 179
Region Association: Northern California



- Is this in a PDF?
- If so, is the text on the right accessible?
- If so, it can be scraped, parsed and then put into a spreadsheet/database.

(IMG:style_emoticons/default/type.gif)
type47
post May 7 2015, 11:02 AM
Post #3


Viermeister
****

Group: Members
Posts: 4,254
Joined: 7-August 03
From: Vienna, VA
Member No.: 994
Region Association: MidAtlantic Region



Have you seen the Parts Vault on this site (sub category of Originality and History)? Maybe something there related to your project...
7TPorsh
post May 7 2015, 11:17 AM
Post #4


7T Porsh
****

Group: Members
Posts: 2,691
Joined: 27-March 06
From: Glendale Ca
Member No.: 5,782
Region Association: Southern California



Maybe set it up like a Wikipedia site. Dump all the part numbers in and everyone has a shot at updating it.
gms
post May 7 2015, 11:35 AM
Post #5


Advanced Member
****

Group: Members
Posts: 2,695
Joined: 12-March 04
From: Chicagoland
Member No.: 1,785
Region Association: Upper MidWest



I put all the part numbers and descriptions in a database about 20 years ago. I will see if I can find the floppy disk (IMG:style_emoticons/default/biggrin.gif) that it is on.
BeatNavy
post May 7 2015, 11:42 AM
Post #6


Certified Professional Scapegoat
****

Group: Members
Posts: 2,924
Joined: 26-February 14
From: Easton, MD
Member No.: 17,042
Region Association: MidAtlantic Region



This is a very cool idea.
QUOTE(SirAndy @ May 7 2015, 12:55 PM) *

- Is this in a PDF?
- If so, is the text on the right accessible?
- If so, it can be scraped, parsed and then put into a spreadsheet/database.

(IMG:style_emoticons/default/type.gif)

It is a PDF, and it is the kind where the text is accessible. I have Acrobat (full) and tried exporting it, but it seems to be locked down in terms of what you are allowed to do. It would not let me save as a Rich Text File or perform any sort of export operation. I'm sure one could eventually figure out the password to lift the security settings, but I can't do anything with it (at least not with the version I have).
McMark
post May 7 2015, 12:46 PM
Post #7


914 Freak!
***************

Group: Retired Admin
Posts: 20,179
Joined: 13-March 03
From: Grand Rapids, MI
Member No.: 419
Region Association: None



Here's the extracted text. The problem is that it's dumped one column at a time, without reference to which row it applies to. So the model designations all come out in a chunk, but not every row has a model designation. On top of that, I selected a page and tried to make sense of the order of the output, but couldn't.
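
If a PDF tool can be found that reports each text fragment together with its position on the page (many can, but that is an assumption here), the rows could in principle be rebuilt from the coordinates instead of the dump order. A minimal sketch, with made-up function names, a hypothetical `(x, y, text)` input format, and column boundaries that would have to be measured from the real pages:

```python
import bisect


def group_into_rows(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments that share roughly the same y into rows."""
    rows = []
    for x, y, text in sorted(fragments, key=lambda f: (f[1], f[0])):
        if rows and abs(y - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append((x, text))
        else:
            rows.append((y, [(x, text)]))
    # Within each row, order the cells left to right.
    return [sorted(cells) for _, cells in rows]


def to_columns(cells, column_starts):
    """Drop each (x, text) cell into its column; missing columns stay empty."""
    out = [""] * len(column_starts)
    for x, text in cells:
        idx = max(bisect.bisect_right(column_starts, x) - 1, 0)
        out[idx] = f"{out[idx]} {text}".strip()
    return out


def rebuild_table(fragments, column_starts, y_tolerance=2.0):
    """Turn positioned fragments into a list of fixed-width rows."""
    rows = group_into_rows(fragments, y_tolerance)
    return [to_columns(cells, column_starts) for cells in rows]
```

Because each fragment is placed by its x position, a row without a model designation simply gets an empty cell in that column instead of shifting everything over. Rows come out in order of increasing y, so the order would need reversing if the tool measures y from the bottom of the page.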


Attached File(s)
Attached File  Extract_Text_Output.txt ( 777.66k ) Number of downloads: 1221
stevegm
post May 7 2015, 01:14 PM
Post #8


Advanced Member
****

Group: Members
Posts: 2,111
Joined: 14-July 14
From: North Carolina
Member No.: 17,633
Region Association: South East States



QUOTE(SirAndy @ May 7 2015, 12:55 PM) *

- Is this in a PDF?
- If so, is the text on the right accessible?
- If so, it can be scraped, parsed and then put into a spreadsheet/database.

(IMG:style_emoticons/default/type.gif)



I agree. I can have one of the programmers that works for me do this if you like.
Andyrew
post May 7 2015, 01:29 PM
Post #9


Spooling.... Please wait
**********

Group: Members
Posts: 13,376
Joined: 20-January 03
From: Riverbank, Ca
Member No.: 172
Region Association: Northern California



JPEGs can be converted to PDF pretty easily...

I do this all the time with PDF/TIF plan pages and door schedules... Extract the data into a workable Excel sheet...

How many pages is this PET file?
McMark
post May 7 2015, 01:54 PM
Post #10


914 Freak!
***************

Group: Retired Admin
Posts: 20,179
Joined: 13-March 03
From: Grand Rapids, MI
Member No.: 419
Region Association: None



QUOTE(Andyrew @ May 7 2015, 12:29 PM) *

How many pages is this PET file?

330, but accuracy is very important
bandjoey
post May 7 2015, 02:00 PM
Post #11


bandjoey
****

Group: Members
Posts: 4,925
Joined: 26-September 07
From: Bedford Tx
Member No.: 8,156
Region Association: Southwest Region



I think it's a great idea but, as usual with P-----, proceed with caution before spending a lot of time. Remember, Pelican used exploded pop-up PET pages on their site and P----- made them take them down. Come up with a secret web name so it won't attract Google attention, etc.
SixerJ
post May 7 2015, 02:33 PM
Post #12


Member
**

Group: Members
Posts: 448
Joined: 24-June 13
From: UK
Member No.: 16,042
Region Association: England



Really cool idea. A while ago I transcribed the 914-6 GT Parts manual to Excel. More than happy to share, or tack it onto the back end of the PET project?
McMark
post May 8 2015, 10:27 AM
Post #13


914 Freak!
***************

Group: Retired Admin
Posts: 20,179
Joined: 13-March 03
From: Grand Rapids, MI
Member No.: 419
Region Association: None



(IMG:style_emoticons/default/icon_bump.gif) Anyone want to take lead on this? I was hoping for some actual action/progress on this project. (IMG:style_emoticons/default/wink.gif)
McMark
post May 8 2015, 09:49 PM
Post #14


914 Freak!
***************

Group: Retired Admin
Posts: 20,179
Joined: 13-March 03
From: Grand Rapids, MI
Member No.: 419
Region Association: None



Okay, last (IMG:style_emoticons/default/icon_bump.gif)

I thought we would get some help here. (IMG:style_emoticons/default/sad.gif)
Mike Bellis
post May 8 2015, 10:51 PM
Post #15


Resident Electrician
*****

Group: Members
Posts: 8,345
Joined: 22-June 09
From: Midlothian TX
Member No.: 10,496
Region Association: None



I'm unlocking it and running text recognition as we speak...

I mean, no that's not what I'm doing... (IMG:style_emoticons/default/biggrin.gif)

My dumputer is running sloooww right now...
Mike Bellis
post May 8 2015, 11:24 PM
Post #16


Resident Electrician
*****

Group: Members
Posts: 8,345
Joined: 22-June 09
From: Midlothian TX
Member No.: 10,496
Region Association: None



Sure is taking a long time... (IMG:style_emoticons/default/sad.gif)

I'll attach it here when complete.
Mike Bellis
post May 9 2015, 09:05 AM
Post #17


Resident Electrician
*****

Group: Members
Posts: 8,345
Joined: 22-June 09
From: Midlothian TX
Member No.: 10,496
Region Association: None



Here it is, in all its glory.

Unlocked and word searchable. The original file was 8.4MB; the file size expanded to 130MB after I finished. Here is a link to download it, as it's too big for this site.

https://app.box.com/s/vupfsixyln4bfn3sya1wl1vaaf1noo8j

Now what? (IMG:style_emoticons/default/confused24.gif)
Mike Bellis
post May 9 2015, 09:31 AM
Post #18


Resident Electrician
*****

Group: Members
Posts: 8,345
Joined: 22-June 09
From: Midlothian TX
Member No.: 10,496
Region Association: None



Now I'm converting it to a Word doc to see what it looks like.

I'm using a program called Bluebeam Revu. It is way more powerful than Acrobat and has a better Word doc generator. I will post it to the same link when ready.
altitude411
post May 9 2015, 09:38 AM
Post #19


I drove my 6 into a tree
***

Group: Members
Posts: 1,306
Joined: 21-September 14
From: montana
Member No.: 17,932
Region Association: Rocky Mountains



(IMG:style_emoticons/default/cheer.gif) (IMG:style_emoticons/default/cheer.gif) (IMG:style_emoticons/default/cheer.gif) Operation "black ops" is on! Way to go, Mike.
ConeDodger
post May 9 2015, 09:59 AM
Post #20


Apex killer!
***************

Group: Members
Posts: 23,581
Joined: 31-December 04
From: Tahoe Area
Member No.: 3,380
Region Association: Northern California



Mark,
I have an original Porsche hard copy if you want to start from scratch and re-digitize it...