afuna_archive: (Default)
I feel content right now. Finally figured out why entries in syndicated accounts occasionally don't seem to follow logic in their ordering and deletion. The entries should ideally be in reverse chronological order, and entries older than fourteen days are deleted. However, if you visit the actual recent entries page of a syndicated account, you'll occasionally see that some entries are out of order, or that there are some entries that were posted to the feed more than fourteen days ago.

No one really notices it, because most feeds update far enough apart that you only ever get one new entry per fetch, or it doesn't really matter what order they appear on your friendslist as long as they show up. And who's going to complain about keeping older entries around?

It bothered me though, because there was no apparent logic behind the (re)ordering and the nondeletion. It wasn't being ordered by the date shown on the entry, and it wasn't being ordered by the entry id either. In fact, it didn't seem to be ordered by anything that I could see on the page (that's because it in fact wasn't ;)).

The other weird thing was that the reordering/nondeletion was usually accompanied by a complaint of the feed not updating. I thought at first that there was something wrong with the parsing of the feed, which would account for the misordering (the dates were not being recognized perhaps?), and which would explain why the feed wasn't updating properly. Turns out that it was the other way around: the feed not updating is the cause of the misordering and the apparent late deletion.

In a word: "logtime"
> plus two words: "vs. entrytime" (display time? eventtime?)
In twelve words: "ordered by logtime; entries fetched at the same time have equal logtime"
> six more words: "been staring me in the face"

In other words: "I love advanced S2 customization -- hi $entry.system_time"

In more words than that (though you need to be able to see ICs in Syn): requests #800206 and #796055.
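
To make the logtime point concrete, here is a minimal Python sketch of the behaviour described above. It is not the actual LJ code: the `Entry` class, the `recent_entries` function and the idea that the fourteen-day cutoff is also measured from logtime are assumptions made for illustration, based on one reading of the symptoms.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Entry:
    title: str
    entrytime: datetime  # date shown on the entry (taken from the feed)
    logtime: datetime    # when the entry was stored, i.e. when it was fetched


def recent_entries(entries, now, max_age_days=14):
    """Return entries newest-first by logtime, dropping old ones.

    Assumption: both the ordering and the age cutoff use logtime,
    never entrytime.
    """
    cutoff = now - timedelta(days=max_age_days)
    kept = [e for e in entries if e.logtime >= cutoff]
    # sorted() is stable, so entries with equal logtime keep the order
    # in which they were stored; entrytime plays no part in the result.
    return sorted(kept, key=lambda e: e.logtime, reverse=True)


if __name__ == "__main__":
    now = datetime(2009, 6, 20, 12, 0)
    one_fetch = datetime(2009, 6, 19, 8, 0)  # feed finally updated here
    backlog = [
        Entry("posted June 1", datetime(2009, 6, 1), one_fetch),
        Entry("posted June 10", datetime(2009, 6, 10), one_fetch),
        Entry("posted June 18", datetime(2009, 6, 18), one_fetch),
    ]
    for entry in recent_entries(backlog, now):
        print(entry.title)
```

In this sketch, a feed that stalls and then delivers a backlog in one fetch gives every backlog entry the same logtime, so they appear in stored order rather than by displayed date, and an entry posted nineteen days ago survives because it was only logged yesterday.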

And with that, I think I've investigated/ICed on/touched the last really stale request that I can. The rest of the things I can handle are only about a week old or less.

(Now it's time for me to concentrate on the stuff I need for China!)
afuna_archive: (ergo)
Hahaha, I feel immensely silly right now. I've been staring at parsefeed.pl, trying to figure out how two pieces of date-parsing code were related to each other, and I only just realized that one is Atom-specific, the other RSS-specific, and that if you're in an Atom feed, which is what I needed to look at, then you return before you reach the RSS-specific parsing code.

(I thought that there would be another module for parsing RSS, just as there was for parsing Atom and that a certain block was being called when parsing both RSS and Atom *facepalm*)

I've been looking at this (dumbly) since last night.

On the good side, the reason I couldn't find where certain attributes were being handled is that these attributes don't seem to be handled at all (the attributes in question being _atom_updated and _atom_published; only _atom_modified and _atom_created are being processed).

I wonder why the Atom parser doesn't use w3cdtf_to_time. Are Atom feeds guaranteed to more strictly follow a certain format?
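
For reference, the Atom 1.0 specification (RFC 4287) does require its date constructs to be full RFC 3339 timestamps, which is narrower than W3CDTF (W3CDTF also allows a bare year, year-month or date). Below is a rough Python sketch of what a lenient W3CDTF-to-epoch conversion might look like. It is not the `w3cdtf_to_time` from the LJ codebase; `parse_w3cdtf` is a hypothetical name, and treating a date with no time part as midnight UTC is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# W3CDTF: YYYY[-MM[-DD[Thh:mm[:ss[.s]]TZD]]], where TZD is Z or +hh:mm / -hh:mm
_W3CDTF = re.compile(
    r"^(\d{4})(?:-(\d{2})(?:-(\d{2})"
    r"(?:T(\d{2}):(\d{2})(?::(\d{2})(?:\.\d+)?)?"
    r"(Z|[+-]\d{2}:\d{2}))?)?)?$"
)


def parse_w3cdtf(text):
    """Return seconds since the Unix epoch, or None if text is not W3CDTF."""
    m = _W3CDTF.match(text.strip())
    if m is None:
        return None
    year, month, day, hour, minute, second, tz = m.groups()
    offset = timedelta(0)
    if tz and tz != "Z":
        sign = -1 if tz[0] == "-" else 1
        offset = sign * timedelta(hours=int(tz[1:3]), minutes=int(tz[4:6]))
    try:
        # Missing parts default to the start of the period; fractional
        # seconds are matched but discarded.
        stamp = datetime(
            int(year), int(month or 1), int(day or 1),
            int(hour or 0), int(minute or 0), int(second or 0),
            tzinfo=timezone(offset),
        )
    except ValueError:
        return None  # e.g. month 13 or an impossible offset
    return int(stamp.timestamp())
```

A stricter Atom-only parser could reject anything that lacks the full date, time and zone, which may be why a separate code path exists, though that is a guess.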

Must recheck my results in this request later, but it looks right to me. If it pans out, then RT.
afuna_archive: (Default)
I find it interesting that servers respond to the Range header in so many different ways.

My newest toy is a simple Perl script which I wrote to test why certain feeds can't be syndicated on LJ. I've only had a chance to use it twice, but both times the feed couldn't be created because the Range header makes the remote server return 206 (Partial Content) instead of 200 (OK), and each server had a different idea of how it should serve up partial content. And to confuse things even more, some servers return 200 (OK) even when the Range header is specified.

First request I tried it with, the remote server returned the feed as a gzipped stream. Second request I tried it with, the server couldn't handle a byte range beyond a certain size (the size of the feed, maybe?); anything larger than that, and it returned 416 (Requested Range Not Satisfiable). I know that the problems with Bad Behaviour are also caused by the Range header, but I am not certain about the specifics, other than that it's rejecting the requests because the byte range starts with a 0.
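
The kind of check described above can be sketched in Python (the original script was Perl and is not shown here, so this is an illustration only). The function names, the `max_bytes` parameter and the wording of the result strings are all hypothetical; the status codes and header names are standard HTTP.

```python
import urllib.request


def build_range_request(url, max_bytes):
    """Build a GET request for the first max_bytes bytes of url.

    max_bytes is an assumed parameter; the limit LJ actually uses is
    not stated in the entry.
    """
    req = urllib.request.Request(url)
    req.add_header("Range", "bytes=0-%d" % (max_bytes - 1))
    return req


def classify_range_response(status, headers):
    """Describe how a server reacted to a Range request.

    status is the numeric HTTP status; headers is a mapping of
    response header names to values (names compared case-insensitively).
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    gzipped = "gzip" in lowered.get("content-encoding", "").lower()
    if status == 200:
        return "range ignored, full body returned"
    if status == 206:
        if gzipped:
            return "partial content, gzip-encoded"
        return "partial content"
    if status == 416:
        return "range not satisfiable"
    return "unexpected status %d" % status
```

To run it against a live feed, pass the request to `urllib.request.urlopen`; note that a 416 arrives as an `urllib.error.HTTPError`, whose `code` and `headers` attributes can be fed to `classify_range_response`.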

I really should drop this, since there's literally nothing I can do about any of it and there are more productive things I could be doing right now. But. Maybe I'll just check out one more feed if another request about failed feed creation comes in, even if finding out what's wrong isn't likely to help get the feed syndicated.

Enough synning for today! I'm jumping ahead of myself again. I have yet to answer a feed change request where the answer is a simple "yep, valid. Feed changed"; I should get on that soon ;)
