Kenshoo Tech Blog
Comments feed, updated 2022-11-30T00:07:50.094-08:00

Anonymous (https://www.blogger.com/profile/17238980384308373469), 2016-12-08T11:25:36.678-08:00
Thank you for the reply.

Anonymous (https://www.blogger.com/profile/12074100082315456990), 2016-11-30T01:01:45.502-08:00
Good question - it doesn't, really. We simply accept it, assuming that failures are rare (they are, in our case) and that these metrics are used for monitoring (and not, say, billing), which makes rare errors insignificant.

Anonymous (https://www.blogger.com/profile/17238980384308373469), 2016-11-28T13:58:01.225-08:00
How does this approach handle the fact that accumulators are, by design, not accurate?
You increment your accumulators while doing a transformation, which might cause duplicate counting on stage failures.

Anonymous (https://www.blogger.com/profile/12074100082315456990), 2016-08-01T02:06:34.609-07:00
Yes - this solution means that the application is halted until the disk is fixed, or REMOVED (excluded from YARN's local-disks configuration), which in our case is done quickly (within minutes) by the on-call engineer. Increasing RAM won't help much, I think, as it won't prevent Spark from using the local disks for cross-node shuffling.
As far as I know, some operations go to disk regardless of RAM availability, but I might be wrong on that.

Anonymous, 2016-07-31T08:09:27.751-07:00
So until that faulty local disk is fixed, the whole cluster is halted?
Btw, would significantly increasing the RAM on each node force Spark to stop using its local disk?

Anonymous (https://www.blogger.com/profile/05529191855645542474), 2016-02-19T04:15:53.289-08:00
Great, really useful.

Thanks!

Avihay (https://www.blogger.com/profile/04592770201813100738), 2015-11-16T22:43:30.058-08:00
This comment has been removed by a blog administrator.