Beyond Polling? Consider PubSub, Push and MOM

Community
    Architecture,
    SOA
Topics
    Data Access,
    Messaging,
    Performance and Scalability
Tags
    XMPP,
    AMQP

At OSCON '08, Evan 'Rabble' Henshaw-Plath and Kellan Elliott-McCrea presented "Beyond REST? Building Data Services with XMPP PubSub". Robert Kaye reported on the presentation:

    Kellan talked about FriendFeed, a site that lets its users know when their friends share new items. In this example, Kellan pointed out that FriendFeed polls Flickr 2.9 million times in order to check on updates for 45 thousand users. And of those 45 thousand users, only 6.7 thousand are logged in at any one time. This, of course, is a poor way of checking for changed content. Kellan says: "Polling sucks!"

    To solve this problem, it's key to leave standard REST web services behind and find a way to use message passing, which is a direct way of notifying users of changed content.
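To put the quoted figures in proportion, the short calculation below works out the ratios they imply. It is only a back-of-the-envelope sketch: the assumption that polls are spread evenly across users is ours, not the speakers', and the report does not state the time period the 2.9 million polls cover.

```python
# Figures quoted in the report above.
polls = 2_900_000    # requests FriendFeed made to Flickr
users = 45_000       # users being checked for updates
logged_in = 6_700    # users logged in at any one time

polls_per_user = polls / users
logged_in_share = logged_in / users

# Assumption (not from the talk): polls are spread evenly across all users,
# so the share spent on absent users equals the share of absent users.
polls_for_absent_users = polls * (1 - logged_in_share)

print(f"polls per user: {polls_per_user:.1f}")
print(f"share of users logged in: {logged_in_share:.1%}")
print(f"polls spent on users who are not logged in: {polls_for_absent_users:,.0f}")
```

Under that assumption, roughly 64 polls are made per user, and about 85% of them check on behalf of someone who is not logged in to see the result.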

Polling, in this context, means using a RESTful web service to GET updates for each user. In contrast, PubSub (Publish/Subscribe) is an architectural approach that uses an asynchronous message-passing protocol in which publishers are decoupled from their subscribers. These characteristics make PubSub a scalable choice for scenarios where update notifications need to be sent to a large number of clients.
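The decoupling described above can be illustrated with a toy in-process broker. This is a minimal sketch, not XMPP PubSub or any real messaging library: the `Broker` class, its method names, and the topic strings are all hypothetical, and delivery is synchronous for brevity, whereas a real PubSub system would deliver asynchronously over the network.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str], None]


class Broker:
    """Toy in-process publish/subscribe hub (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be called for every message on a topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver a message to every handler on a topic; return the count."""
        handlers = list(self._subscribers.get(topic, []))
        for handler in handlers:
            handler(message)
        return len(handlers)


broker = Broker()
inbox_a: List[str] = []
inbox_b: List[str] = []

# Two subscribers register interest; the publisher never learns who they are.
broker.subscribe("photos/alice", inbox_a.append)
broker.subscribe("photos/alice", inbox_b.append)

delivered = broker.publish("photos/alice", "new photo uploaded")
idle = broker.publish("photos/bob", "nobody is listening")

print(delivered, idle, inbox_a)
```

The publisher only names a topic, and work is done only when something actually changes. A topic with no subscribers costs nothing, which is the opposite of polling every user on a timer.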

In the presentation, Evan and Kellan described the advantages of Jabber, a PubSub service based on XMPP (Extensible Messaging and Presence Protocol):

       1. XMPP works over persistent connections
       2. It is stateful (SSL becomes cheap)
       3. Designed as an event stream protocol
       4. Natively federated and asynchronous
       5. Identity, security and presence are built in
       6. Jabber servers are built and deployed to do this stuff

The presentation generated a lot of discussion. Kirk Wylie suggested that a system based on MOM (Message Oriented Middleware), such as AMQP, is really what is needed here, while Joshua Schachter (del.icio.us founder) added his voice to the debate by pointing out that a simpler approach based on HTTP callbacks could be used:

    Simply described, instead of polling frequently, a client would send a normal HTTP request with the resource to be subscribed to and an endpoint to deliver updates to: http://your.app/subscribe?resour ... p://my.app/endpoint

    Presumably the endpoint would then receive RSS item fragments when and only when that resource updated. For security, the exchange should include some kind of token, borrowing from the appropriate protocols. The subscription would lapse after, say, 24 hours, or that could be passed in as a parameter.

Commenters pointed out that such a system already exists: Webhooks.
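The server-side bookkeeping implied by the callback proposal (a resource, a delivery endpoint, a security token, and a subscription that lapses after 24 hours unless another lifetime is passed in) might be sketched as follows. Everything here is an assumption for illustration: `SubscriptionRegistry`, its methods, the resource name, and the endpoint URL are hypothetical, and no HTTP request is actually sent.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, List, Optional

DEFAULT_TTL_SECONDS = 24 * 60 * 60  # the "say, 24 hours" from the proposal


@dataclass
class Subscription:
    resource: str
    endpoint: str
    token: str         # shared secret the publisher would attach to each callback
    expires_at: float  # Unix timestamp after which the subscription lapses


class SubscriptionRegistry:
    """Hypothetical server-side store for callback subscriptions."""

    def __init__(self) -> None:
        self._subs: Dict[str, Subscription] = {}

    def subscribe(self, resource: str, endpoint: str,
                  ttl: float = DEFAULT_TTL_SECONDS,
                  now: Optional[float] = None) -> Subscription:
        """Record a subscription and issue a random token for it."""
        now = time.time() if now is None else now
        sub = Subscription(resource, endpoint, secrets.token_urlsafe(16), now + ttl)
        self._subs[sub.token] = sub
        return sub

    def endpoints_for(self, resource: str,
                      now: Optional[float] = None) -> List[Subscription]:
        """Return live subscriptions for a resource, dropping lapsed ones."""
        now = time.time() if now is None else now
        self._subs = {t: s for t, s in self._subs.items() if s.expires_at > now}
        return [s for s in self._subs.values() if s.resource == resource]


registry = SubscriptionRegistry()
# Placeholder resource name and endpoint URL, chosen only for this example.
sub = registry.subscribe("example-resource", "http://subscriber.example/endpoint", now=0.0)

live = registry.endpoints_for("example-resource", now=3600.0)
lapsed = registry.endpoints_for("example-resource", now=DEFAULT_TTL_SECONDS + 1.0)

print(len(live), len(lapsed))
```

When a resource changes, the publisher would look up its live subscriptions and send the update to each endpoint along with the token. The outgoing delivery queue is where the aggregator and scaling concerns raised in the discussion come in.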

Rabble provided his thoughts on using callbacks:

    So there are a couple things, the ping back system is something which we had thought about putting a slide in for. Clearly being able to do ping backs / web hooks seems like a good pattern, maybe anti-pattern?

    It works out that creating a big scale web hooks system you end up with aggregator / crawler problems. Definitely doable. With XMPP we get that functionality with federation and a nicer interface. We also can potentially add some delegated auth over it.

An approach such as Webhooks is potentially a lot simpler than using Jabber/XMPP, but Blaine Cook thought that the added complexity is warranted:

    If we're arguing that this system needs to be usable by 10-line PHP scripts, then a poorly implemented outgoing message queue is par for the course. At a moderate scale (10000 users, 50 contacts each, 1/5 off-site, 2 posts per day), you're looking at 2.3 remote HTTP requests per second, which isn't nothing.
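The 2.3 requests per second in the quote can be reproduced from the numbers it gives. The calculation below assumes one remote HTTP request per off-site contact per post, which is our reading of the quote rather than something it states outright.

```python
users = 10_000
contacts_per_user = 50
off_site_fraction = 1 / 5
posts_per_user_per_day = 2

# Assumption: each post triggers one remote request per off-site contact.
remote_deliveries_per_post = contacts_per_user * off_site_fraction
remote_requests_per_day = users * posts_per_user_per_day * remote_deliveries_per_post
remote_requests_per_second = remote_requests_per_day / (24 * 60 * 60)

print(f"{remote_requests_per_day:,.0f} remote requests per day")
print(f"{remote_requests_per_second:.1f} remote requests per second")
```

That comes to 200,000 remote requests per day, or about 2.3 per second as a sustained average. Real traffic would be burstier than the average suggests.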

While using PubSub for notifications is a good architectural approach, many took issue with the title of the presentation. Dare Obasanjo summed it up best, pointing out that REST is not a Golden Hammer (the habit of treating everything as a nail once you are holding a hammer):

    [Thus] this isn't a case of REST not scaling as implied by Evan and Kellan's talk. This is a case of using the wrong tool to solve your problem because it happens to work well in a different scenario.

If nothing else, the discussion has highlighted the fact that consideration of the usage patterns of API consumers plays a large part in determining an appropriate design.