SoC: AtomPub Week 8 Status
July 18, 2008
Week 8 of the Summer of Code was nothing short of a roller coaster ride. At some points I wanted to tear my hair out, other times I was jumping for joy. Here is what went down this week:
Bad: Regular Expressions Are Not My Thing
While working towards media importing this week, I had to write a few regex statements. I’ll admit it, I completely suck at regular expressions. They bring me back to my not-so-fond days of Discrete Mathematics; days I would rather forget. Regardless, I managed to get through them and ended up with some working code.
Good: Faster Response Times From TypePad
While working on media importing, I noticed TypePad’s servers have greatly increased in speed this week. I found that importing took roughly half the time it did in the previous week, with no code optimizations on my part. The speed increases stayed consistent through the week, so I’m hoping the changes are here to stay.
Bad: Media Importing is Near Impossible
Wednesday came to a grinding halt when I noticed there was no way to reliably import media by parsing images from posts. There are several reasons for this:
- TypePad has two different methods for uploading images.
- The markup TypePad generates for images is unclean and is not standardized. Users have the ability to modify the output code template.
- TypePad puts the full size image on a separate HTML page that is linked to via a URL that tells nothing about the original file. Due to this, it is near impossible to link the full size image with the originating link.
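To make the parsing problem concrete, here is a rough sketch (in Python rather than the importer's PHP) of pulling image URLs out of post markup with a regular expression. The sample markup, the popup URL, and the function name are invented for illustration; real TypePad output varies from blog to blog, which is exactly the difficulty described above.

```python
import re

# Matches the src attribute of an <img> tag, quoted with either " or '.
IMG_SRC = re.compile(
    r'<img\b[^>]*?\bsrc\s*=\s*(["\'])(.*?)\1',
    re.IGNORECASE | re.DOTALL,
)

def find_image_urls(post_html):
    """Return the src of every <img> tag in the post, in document order."""
    return [match.group(2) for match in IMG_SRC.finditer(post_html)]

# Hypothetical post body: a thumbnail linking to an opaque popup page,
# followed by a second image written with different casing and quoting.
sample = (
    '<p><a href="http://example.typepad.com/popup-page.html">'
    '<img alt="x" src="http://example.typepad.com/photos/x-thumb.jpg" /></a></p>'
    "<p><IMG SRC='http://example.typepad.com/images/y.png'></p>"
)

print(find_image_urls(sample))
```

Note what the sketch cannot do: it finds the thumbnail, but nothing in the popup link identifies the full size file behind it.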
So, media importing is unfortunately looking very glum. I will try to come up with something since importing media is a crucial feature in my book, but I’m not sure what. More on this in the coming weeks.
Good: WordPress 2.6 Released
The new version of WordPress was released this week, and I was happy for several reasons. First off, I was able to update my blog and take advantage of the new features, but more importantly, trunk is going bleeding again. Hopefully in the next week or so the TypePad importer should land in trunk. Stay on the lookout!
Good/Bad: Headway Made on Movable Type Authentication Issue
I managed to get in touch with the developer over at Six Apart who originally wrote Movable Type’s Atom API implementation. The good news is he confirmed I’m not crazy and MT is authenticating differently than TypePad. The bad news is he couldn’t remember offhand what is different about the implementation.
From what it sounds like, MT is not following the RFC 5023 spec with regard to WSSE, simply because the specification was not standardized when the original code was written. I’m not sure where this leaves me, because I don’t really know what is different about authentication at the moment. Also, I’m unsure if Movable Type will correct the authentication difference in the near future.
At the moment, it appears I will be brushing up on my Perl skills and looking at the source code for Movable Type next week. With a little luck, this issue will finally be ironed out. Who knows, maybe I’ll even figure out something for media importing next week as well. Right now, I’m just hoping next week is the week of miracles.
How to Make the iPhone a Trusted Platform
July 12, 2008
The launch of Apple’s iPhone / iPod touch App Store appears to have been a great success. Apple managed to pull in several well-known Mac, Palm, and game developers to contribute to the over 500 applications available at launch. Already there is an application to meet nearly every need, and with the majority of remaining developers being accepted into the developer program, I’m sure there will be more great applications in the coming days and weeks.
While the App Store and applications have been a huge hit, playing around with applications over the past two days has filled me with some worries. Worries that Apple will need to address if they want the iPhone to succeed as a platform.
Application Data
After trying out several applications yesterday, a major flaw in the way application data is stored became apparent. Application data (preferences, files, saved data such as games) is all stored directly linked with the application that created it. Therefore, if an application is uninstalled, everything that application ever created is cleaned up and thrown into the ether. Sounds like a great way to make sure the iPhone stays uncluttered, no? Well it is, but keeping the iPhone clutter free brings problems.
Due to the nature of the application / data relationship, if an application is removed from the iPhone unintentionally, everything that application ever stored is removed permanently. Let me give you an example.
While messing with iTunes’ settings yesterday, I changed my iPod touch’s Application syncing preference to selective applications only. I forgot to select a few applications, and they were removed from my iPod touch the second I clicked apply. No big deal, right? I reselected the applications, and they appeared on my iPod touch with one exception - they were reset to their default settings and no longer contained my saved data. Thankfully, I only lost my Facebook login settings, my Flickr login settings, and my level 8 save game of Enigmo, but the results could have been much worse.
Looking at the list of applications currently available in the App Store, I would say 95% of applications would be fine with their data reset. Users would only lose some display settings, maybe a login or two, and that would be all. However, as the platform matures, more applications (and their users) will become reliant on stored data. Imagine finding out a year’s worth of mileage logs disappeared during your iPhone’s last restore. That could be a disaster.
Syncing
Part of this problem is due to the application / data relationship, but the bigger issue is the lack of a standardized syncing method in the iPhone OS. I would have no problem losing data after a restore if that data could easily be added back, but at the moment there is no way to restore application data (yes, I’m aware iTunes currently stores a backup, but that is only of the most recent sync, and is no help if a single application loses its data).
Currently all of the native iPhone applications, with the exception of Notes and SMS, sync through an application that manages the iPhone’s stored data. Calendar items sync with iCal, addresses sync with Address Book, and so on. However, third-party applications are left to fend for themselves. Some application developers have cleverly worked around this by utilizing “the cloud” (great example is OmniFocus’ WebDAV sync), but applications without desktop counterparts are left stranded.
How to Trust the iPhone Platform
If Apple wants the iPhone platform to be trusted among businesses and consumers, they need to address these issues. Start by backing up application data separately from the application itself, so that when the application is reinstalled, the data can be restored as well. With simple changes such as this, the iPhone will not only be the most innovative mobile platform today, it can become the most trusted platform as well.
SoC: AtomPub Week 7 Status
July 11, 2008
In short, this past week of the Summer of Code has been a rebuilding and planning week. In addition to filling out midterm surveys and reports, I shifted gears toward media importing.
Why Import Media?
The first question I asked myself this week was why TypePad’s media should be imported into WordPress. This was a relatively easy question to answer. TypePad, being a paid service, deletes all content when an account is canceled. This includes images uploaded and stored in TypePad’s cloud. So, importing images and the like is important if a TypePad switcher does not want to see broken images all over the place.
Can TypePad Media Be Imported?
After determining media importing is essential, I explored my options for getting media into WordPress. Thankfully, TypePad has an Atom API for their web galleries. Unfortunately, from my testing this Atom API does not include single image uploads. So, the Atom API may be out of the question for media importing.
Since Atom API media importing appears out of the question, I started looking at their XML-RPC documentation. To no surprise, they do not have a method for retrieving media through XML-RPC. Therefore, I’m left with one option: finding media through URLs during the import process.
Sadly, this option is not ideal. For one, it will add time to the already slow import process. Also, this option will force media to be imported at the time of the initial import, since the content can only easily be traversed during the post import. So, I’m not too pleased with my options.
Next Week
Next week I plan on starting to code the media importing during the import process. I plan on making this optional, since it will most likely add significant time to the import process. However, before I do that, I will go through my options one more time. If anyone thinks of any alternatives from now until then, I’d love to hear them.
SoC: AtomPub Week 6 Status
July 4, 2008
Time is flying. The conclusion of this week marks the midway point of the Summer of Code, and the supposed ready-for-core date I set back at the beginning of summer. So, how did I stack up?
This Week
This week saw the addition of Atom URI detection based on the plain blog URL. To detect the Atom URI, I have to parse the HTML page for a link to the RSD page, then parse the RSD page for the Atom URI. It’s a multi-step process, but the most reliable should TypePad ever change the Atom URI on me. In fact, I’d like to mention that every single URI (comments, paging, etc.) is automatically detected within the importer, so the importer should be future-proof against URI changes.
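The two-step detection can be sketched as follows (in Python rather than the importer's PHP). This is an illustration under assumptions, not the importer's code: it assumes the blog page advertises its RSD document with a `<link rel="EditURI">` tag and that the RSD document lists an API named "Atom". The function names and sample documents are made up.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

class RsdLinkFinder(HTMLParser):
    """Remembers the href of the first <link rel="EditURI"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.rsd_url = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and rel == "edituri" and self.rsd_url is None:
            self.rsd_url = attributes.get("href")

def find_rsd_url(html):
    """Step 1: find the RSD document's URL in the blog's HTML."""
    finder = RsdLinkFinder()
    finder.feed(html)
    return finder.rsd_url

def find_atom_uri(rsd_xml):
    """Step 2: find the apiLink of the API named "Atom" in the RSD document."""
    root = ET.fromstring(rsd_xml)
    for element in root.iter():
        # Compare local names so the code works with or without a namespace.
        if element.tag.rsplit("}", 1)[-1] == "api":
            if (element.get("name") or "").lower() == "atom":
                return element.get("apiLink")
    return None

# Hypothetical sample documents.
page = (
    '<html><head><link rel="EditURI" type="application/rsd+xml" '
    'title="RSD" href="http://example.com/rsd.xml" /></head></html>'
)
rsd = """<rsd version="1.0" xmlns="http://archipelago.phrasewise.com/rsd">
  <service>
    <apis>
      <api name="MetaWeblog" preferred="true" apiLink="http://example.com/xmlrpc" blogID="1" />
      <api name="Atom" preferred="false" apiLink="http://example.com/atom/weblog" blogID="1" />
    </apis>
  </service>
</rsd>"""

print(find_rsd_url(page))
print(find_atom_uri(rsd))
```

In a real run, the RSD URL found in step 1 would be fetched over HTTP before step 2; the sketch skips the network so it can be tried offline.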
In addition to AtomPub URI detection, I added a progress bar this week. I thought this would be a huge process, but thankfully the coding was relatively simple. I learned some techniques to force PHP to dump the output buffer, essentially updating the page. Paired with some simple JavaScript, I was able to create a fairly responsive progress bar. I chose this method over a completely JavaScript method since the JavaScript method would require additional time to run (I can explain this technically if anyone wants to know why).
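For the curious, the flush-as-you-go idea looks roughly like this. The sketch is in Python, not the importer's PHP (where the flushing would be done with `ob_flush()` and `flush()`), and the markup, element id, and function names are assumptions made for illustration.

```python
import sys

def progress_chunks(total):
    """Yield HTML fragments: the bar first, then one small script per post."""
    yield '<div id="bar" style="width:0%;background:#69c">&nbsp;</div>\n'
    for done in range(1, total + 1):
        # ... the real importer would import post number `done` here ...
        percent = done * 100 // total
        yield (
            '<script>document.getElementById("bar").style.width="%d%%";</script>\n'
            % percent
        )

def stream(total, out=sys.stdout):
    """Write each fragment and flush right away so the browser sees it now."""
    for chunk in progress_chunks(total):
        out.write(chunk)
        out.flush()  # without this, output may sit in a buffer until the end

if __name__ == "__main__":
    stream(4)
```

Each flushed script tag runs as soon as the browser receives it, so the bar grows while the same request is still doing the import.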
Overall Status
Since we’re at the midway point, I thought now would be a great time to look back at the overall status. As I mentioned earlier in the post, I chose this date as the ready-for-core inclusion date. Since the release cycle for WordPress 2.6 has been pushed up, core inclusion will most likely be pushed back, because the AtomPub importer will not be ready for 2.6. Regardless of the actual core inclusion, the deadlines have not changed in my mind.
According to my requirements for core inclusion, by now the importer should be converting AtomPub data into actual WordPress data. That’s occurring, so I’m definitely still on track.
The Second Half
So, where’s the importer going from here? Well, I plan to start working on the media importing portion of the importer next week. I’m figuring that may take at least two weeks. By then, WordPress 2.6 should be near release, so core inclusion can be considered once trunk goes bleeding again. After the importer is included in core, I can start getting real feedback from users. That should allow me to find and fix bugs, in addition to working on speed enhancements on a much larger scale.
The Importer In Action
I thought I would leave you this week with a look at the importer in action. A few notes for the video:
- The first item takes a while to display since the first 20 posts need to be requested, plus two requests for comments, in addition to the standard trackback and draft checks.
- I would like to add a throbber to the in progress page to make the wait for the first post less painful.
- You’ll notice the importer semi-stall after 20 posts. This is because it needs to request another group of 20 posts.
- By the time the average WordPress user sees this, I would really love to see the importer work even faster than this, and I will strive during the second half to make that happen.
SoC: AtomPub Week 5 Status
June 27, 2008
Another week down, another step closer to a working AtomPub importer. Unfortunately, week 5 went anything but according to plan. Sure, I fixed the bugs found at the end of last week, but new issues came to light, requiring changes in the week’s plan.
New Issues Found
Additional testing early in the week by my mentor Lloyd brought forth some coding challenges. First off, Lloyd found a few error messages on import. Those were quickly resolved, but once Lloyd made it past the error messages, he found the performance of the importing to be subpar.
After adding some performance measurements to the importer, the source of the problem was revealed. The multiple requests for different feeds of data add up over time. Essentially, for every post the importer needs to ping the post URL to check for a 404 (draft status), request the comments feed, and request the trackbacks XML-RPC data. Each post was taking over a second, which quickly adds up across a whole blog.
Progress, Progress, Progress
Unfortunately, nothing can be done at this time to lower the request time; the feed requests are at the mercy of the internet. However, the notifications can be enhanced so a user is not wondering why the importer has not finished.
So, after discussing the issue with Lloyd, I think a progress bar is needed in this situation. Unfortunately, due to the nature of PHP applications, I can’t just add a progress bar out of nowhere. I will need to modularize the importer into a more AJAXy interface, so an AJAX progress bar can be updated with the import status. I will begin looking at solutions for this later in the upcoming week.
Even More Issues
The performance issue was not my only problem this week. Lloyd found that when importing from a blog with 3,000 entries, the importer ran out of memory. Surprisingly, it ran out of memory at around 130MB, which would be crazy on a normal web server, given PHP is typically limited to 16MB of RAM.
Once this issue was pointed out to me, I quickly found the problem. I had been putting all entries in a massive array before looping through them to import. So, to correct this I limited the importer to batches of 20 posts at a time, freeing the memory between each set of posts. This appears to have corrected the problem.
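The batching fix can be sketched like this (in Python rather than the importer's PHP). The `fetch_page` and `import_one` callables are placeholders I made up: the first stands in for one feed request, the second for writing one entry to the database.

```python
def import_in_batches(fetch_page, import_one, size=20):
    """Import entries one page at a time; return how many were imported.

    Only the current page is held in memory. Once `batch` is rebound on the
    next pass, the previous page's entries are free to be garbage collected.
    """
    start = 0
    while True:
        batch = fetch_page(start, size)  # one request, at most `size` entries
        if not batch:
            return start
        for entry in batch:
            import_one(entry)
        start += len(batch)

if __name__ == "__main__":
    # Hypothetical 45-entry blog served from a range instead of a real feed.
    def fake_fetch(start, size):
        return list(range(start, min(start + size, 45)))

    print(import_in_batches(fake_fetch, lambda entry: None))
```

The trade-off is one extra request per 20 entries in exchange for a memory footprint that no longer grows with the size of the blog.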
In addition to the memory issue, I found out that the comments feed has the same 20-at-a-time restriction that the main feed had. Already familiar with the issue from the main feed, I corrected it and all comments started to be imported.
Outlook Looks Good
Despite the large number of issues discovered this week, I think the future of the importer is looking better than ever. Some major hurdles were overcome this week, and because of that, this week ends with a more memory-efficient, error-free version of the importer.
With the new discoveries, obviously the plan has been changed a bit. Currently, I’m looking at finally (and yes, I mean finally) writing the code to automatically detect the Atom API feed at the beginning of next week. From there, I will begin working on updating the interface to be more AJAXy, providing notifications along the way.
SoC: AtomPub Week 4 Status
June 20, 2008
Now completing the fourth week of coding, the AtomPub importer is finally starting to take shape. This week I retrieved trackbacks, started importing the previously retrieved data into WordPress, and added a user interface enhancement.
Trackbacks Are Go
If you remember last week, I had some difficulties getting trackbacks working. Well, thankfully that is no longer the case. Earlier this week Joseph Scott helped me figure out the code needed for accessing TypePad’s XML-RPC API. With this addition, all standard blog data is now being imported.
Importing is Go
After solving the trackback dilemma, I started working on actually importing the arrays retrieved from the AtomPub server. Since much of the import code is shared with other WordPress importers, this process went fairly quickly. I only needed to make a few minor adjustments to some code, and by Thursday arrays were becoming rows in WordPress’ database.
Notifications Are Go
After testing importing several times, a logical enhancement occurred to me. Occasionally the AtomPub server can take a while to respond and feed the data into WordPress. During this time, a user would be sitting at the initial page with no notification of activity other than the browser’s loading indicator. This could have raised suspicions that the page was not loading, when in fact everything was working perfectly. So, I added a small throbber and message text while the importing occurs. The enhancement is small, but it should bring peace of mind to those getting antsy.
Bugs Are Go
What would a coding project be without bugs? Today, my mentor Lloyd notified me of several small bugs. It turns out the importer has been generating warning messages left and right. I’ve been oblivious to this since apparently MAMP had PHP error reporting disabled. I have since enabled error reporting, and will start fixing the small bugs early next week.
Unfortunately, a major bug was also discovered. Lloyd found that only 20 entries will import. Since I’ve been working with only a few entries at a time, I had not run into this problem yet. I suspect the issue is with the AtomPub server, and an additional request will be needed for each set of entries over 20. I’ll know more next week when I take a closer look at the issue.
Ronald is coding …and he has a plan.
I hope someone gets the above reference. Anyway, as you may have guessed, the early part of next week will focus solely on tackling the bugs found today. Hopefully the issues will not be problematic so I can begin the next task: allowing users to select an author to import all entries under. Once the author override code is committed, the next step on my coding agenda is to finally write the code to automatically detect the AtomPub URL based on the website’s blog address. Those three items should keep me busy next week, and as always, I’ll let you know next week how the coding went.
Why Should Anyone Use Safari on Mac?
June 15, 2008
Alright, that is a sensationalist title, but I needed a strong title to show my hypocrisy. Today, I have made the switch from Safari 3 to Firefox 3. I have realized despite the numerous advantages Safari has with direct operating system integration, Firefox still wins out feature-wise. To help make my decision, I made several lists of the advantages and disadvantages that matter to me. Below are those lists.
Advantages of Safari
- Operating system dictionary integration. Supporting shortcuts like dictionary lookup (command+shift+D).
- iPod touch bookmark syncing.
- Launches fairly fast and browses pretty quickly.
- Has an amazing search plugin.
- Interface is completely “Mac-like”.
Disadvantages of Safari
- Flash currently chokes in Safari. Safari won’t crash, but can easily freeze for over a minute when viewing Flash content.
- After three operating system updates, the vanishing cookie bug remains.
- Some websites still won’t let you use Safari to fill out forms, etc.
Advantages of Firefox
- Lightning fast. Has not crashed or frozen on me once since install.
- The amazing awesome bar.
- The bookmark “star” system works wonders. Very easy for temporarily bookmarking websites for later reference.
- When multiple tabs are open, the tab bar scrolls.
- Supports Google Gears.
- Extensions can fill any feature void.
Disadvantages of Firefox
- Interface is only partially Mac-like, even with GrApple.
- Spell checking is not as nice as Safari.
- Firefox 3 occasionally renders some pages strangely, due to the new text rendering.
Looking over the lists, Safari’s advantages are mostly in the interface, while the disadvantages can quickly become show stoppers. For Firefox, the advantages are in the features, while the disadvantages are only minor quibbles. When you enumerate the features, Firefox wins hands down - at least for me.
So, my final ruling is Firefox wins this browser round. If Safari 4 can manage to fix the Flash freezes and remember cookies, Safari has a good chance of winning round 4. Until then, Firefox will remain my browser of choice.
SoC: AtomPub Week 3 Status
June 13, 2008
Week 3 of the Summer of Code has been by far the most productive week yet. The main focus of this week was to parse the AtomPub data into a PHP array, and I’m pleased to say this was a success.
The XML Parser
I started the week off by writing the custom XML parser I talked about last week. To do this, I researched several different methods for utilizing PHP’s xml_parse function. Since the parsing occurred on an established standard where the tag names will not change on me, I decided to parse the tags based on a tag name switch. This worked well until I started running into nested tags. That problem, though, was quickly resolved with the use of a few class variables.
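Here is roughly what that approach looks like, sketched in Python with its expat bindings instead of PHP's xml_parse. It is an illustration, not the importer's parser: the class name, the handful of tags it keeps, and the sample feed are all assumptions. The instance variables play the role of the class variables mentioned above, tracking which parent tag the parser is currently inside.

```python
import xml.parsers.expat

class AtomEntryParser:
    """Event-driven parser that branches on tag names and tracks nesting."""

    def __init__(self):
        self.entries = []
        self.current = None     # dict for the <entry> being built, else None
        self.in_author = False  # True while inside <author>
        self.text = []          # character data of the innermost open tag

    def start(self, name, attrs):
        self.text = []
        if name == "entry":
            self.current = {}
        elif name == "author":
            self.in_author = True

    def chars(self, data):
        self.text.append(data)  # text can arrive in several pieces

    def end(self, name):
        value = "".join(self.text).strip()
        self.text = []
        if name == "entry":
            self.entries.append(self.current)
            self.current = None
        elif name == "author":
            self.in_author = False
        elif self.current is not None:
            # The nesting state decides what a tag name means here:
            # <name> only matters inside <author>, and the feed-level
            # <title> is skipped because no entry is open.
            if name == "name" and self.in_author:
                self.current["author"] = value
            elif name in ("title", "content", "published") and not self.in_author:
                self.current[name] = value

    def parse(self, xml_text):
        parser = xml.parsers.expat.ParserCreate()
        parser.StartElementHandler = self.start
        parser.EndElementHandler = self.end
        parser.CharacterDataHandler = self.chars
        parser.Parse(xml_text, True)
        return self.entries

# Hypothetical, heavily trimmed feed.
feed = (
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    "<title>My Blog</title>"
    "<entry><title>First post</title>"
    "<author><name>Example Author</name></author>"
    '<content type="html">&lt;p&gt;Hello&lt;/p&gt;</content></entry>'
    "</feed>"
)

print(AtomEntryParser().parse(feed))
```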
Missing Data
Once the XML was in a parsed array, I began looking over the array and envisioned how this data would import into WordPress. During this process, I realized the AtomPub feed was missing two bits of key data: the draft status of posts and a list of trackbacks. I immediately began looking into possible workarounds.
While I investigated solutions, my mentor Lloyd discussed the missing data with Six Apart. We were assured support for app:draft was on their todo list, but they did not commit to any date for availability. So, Lloyd gave me the go ahead for workarounds.
To solve the draft problem, I ended up creating a 404 checker. Assuming that drafts will not be published, the URL for the post should result in a 404. Knowing this, while posts are imported I loop through the URLs and check the HTTP status codes. The workaround certainly isn’t the best as it’s resource intensive, but for the time being it works.
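A minimal sketch of the 404 check, in Python rather than the importer's PHP. The function names are mine, the use of a HEAD request is an assumption (the post does not say which method the importer uses), and the rule "404 means draft" is the same assumption stated above, not something TypePad guarantees.

```python
import urllib.error
import urllib.request

def http_status(url, timeout=10):
    """Return the HTTP status code for `url` using a HEAD request.

    Network failures (DNS errors, timeouts) are not handled here and
    will raise urllib.error.URLError.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses arrive as exceptions

def looks_like_draft(url, get_status=http_status):
    """Treat a post whose public URL returns 404 as an unpublished draft."""
    return get_status(url) == 404

if __name__ == "__main__":
    # Offline demonstration with a stand-in for the network call.
    print(looks_like_draft("http://example.com/draft", get_status=lambda url: 404))
```

Passing the status lookup in as a parameter keeps the draft rule testable without touching the network, and makes the cost visible: one extra request per post.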
After fixing the draft problem, I looked into solutions for the missing trackbacks. I found a function for this on TypePad’s XML-RPC developer site; however, my attempts to implement the function call have failed. So, I continued to search for alternatives.
I found out today that Movable Type has a hidden RSS feed for trackbacks. I tested this and indeed it is true. Unfortunately, TypePad does not seem to have this feed. My guess is this is because of their premium pricing model, which removes support for additional and custom templates in the lower tiers. If anyone happens to know the super-secret URL for a post’s trackbacks on TypePad, I would love to know, but I truly believe that feed does not exist.
Therefore, the search for a trackback solution continues. For the time being this is being put on the back-burner. When I get some free time over the next couple of weeks or during the second half of the Summer of Code I will revisit this problem, but at the moment trackback support is being forgone.
Next Week’s Plan
So, what’s up for next week? Early next week I plan on working on the actual importing of data into WordPress. All the arrays are prepped and the functions are ready, so the importing process should go fairly quickly. I’m actually anticipating finishing up the import code by the middle of next week, but if things don’t go to plan I have until the end of the following week. Should I finish early, I will revisit some of the priority two items accumulated over the past few weeks. With a little luck, next week will bring a functioning importer with some additional fixes.
SoC: AtomPub Week 2 Status
June 6, 2008
While I didn’t blog about last week’s status, significant progress has been made in parsing TypePad and Movable Type AtomPub feeds (well, parsing TypePad feeds). This week started off by completing more research on the AtomPub spec. In order to parse an AtomPub feed, I had to learn about X-WSSE authentication. From there, I found a great X-WSSE class that I included in my test version of WordPress. Then, the fun began.
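For anyone unfamiliar with X-WSSE, the header is built roughly as follows. This is a Python sketch of the common UsernameToken recipe (digest = Base64 of SHA-1 over nonce + created + password), not the class I actually used, and the function name is made up. Implementations are known to differ on details such as whether the nonce is hashed raw or encoded, so treat the exact byte handling as an assumption.

```python
import base64
import hashlib
import os
from datetime import datetime, timezone

def wsse_header(username, password, nonce=None, created=None):
    """Build the value of an X-WSSE request header.

    Assumes the common recipe: the raw nonce bytes go into the hash,
    and the Base64-encoded nonce goes into the header.
    """
    if nonce is None:
        nonce = os.urandom(16)  # fresh random bytes for every request
    if created is None:
        created = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    digest = hashlib.sha1(
        nonce + created.encode("ascii") + password.encode("utf-8")
    ).digest()
    return 'UsernameToken Username="%s", PasswordDigest="%s", Nonce="%s", Created="%s"' % (
        username,
        base64.b64encode(digest).decode("ascii"),
        base64.b64encode(nonce).decode("ascii"),
        created,
    )

if __name__ == "__main__":
    print("X-WSSE: " + wsse_header("bob", "secret"))
```

Because the password never travels in the clear and the nonce and timestamp change per request, a captured header cannot simply be replayed later.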
Movable Type Hates Me
Almost immediately, I was retrieving the raw XML of TypePad’s AtomPub feed. Unfortunately, I could not say the same for Movable Type. Due to some server configuration issues on my end or possibly a bug in Movable Type, I am unable to retrieve Movable Type AtomPub feeds at the moment. I’ve tried various methods of retrieving the feed and each method returns the same cryptographic error message. I’ve called in the experts (my mentor, Lloyd) to help me, but if anyone has any clue as to why Movable Type hates me, I would appreciate the feedback.
Parsing the Feed
Regardless of the error, I kept on trucking with the TypePad AtomPub feed, knowing that Movable Type will fall in line once I can figure out my retrieval problems. Working with TypePad’s feed, I started trying different methods of Atom parsing. I first tried WordPress’ built in Magpie parser, but due to Magpie not supporting Atom 1.0, that was little help. I then tried some code snippets on php.net, but unfortunately none of the snippets parsed in the manner I required. So, I started writing my own basic XML parser.
That’s where I stand today. This coming week I will continue writing my custom XML parser, completing the parser by the end of next week. The goal is to have all AtomPub data in an array, so I certainly have my work cut out for me.
Summer of Code 2008 Kickoff
May 26, 2008
Google’s Summer of Code 2008 is officially underway! This year I am working on creating an AtomPub-based content importer for WordPress. The goal is to import entries and other content from Movable Type and TypePad into WordPress in as few clicks as possible.
Since the AtomPub spec (RFC 5023) is so new, this should prove to be an interesting summer. I will be one of the first to implement a real world use of AtomPub, and I suspect documentation will be scarce. Regardless, I am up for the challenge and can’t wait to see how the end product turns out.
If you’re interested in the progress of this project, just stay tuned to this blog. I will be blogging weekly updates on my progress, so you will always know where I stand.