Monday 20 January 2014

Counters in MapReduce: "Channel for Statistics Gathering"

In MapReduce, counters provide a useful way to gather statistics about a job and to diagnose problems. The statistics may be gathered for quality control or for application-level control. Hadoop also maintains some built-in counters for every job, which report various metrics.

Advantages of using counters:
  • A counter is a better way to record whether a particular condition occurred than a log message written from a map or reduce task.
  • Counter values are much easier to retrieve than log output for large distributed jobs.

Disadvantages of using counters:
  • Counters may go down if a task fails during a job run.

Built-in Counters


As mentioned above, Hadoop maintains some built-in counters for every job. Counters are divided into various groups. Each group contains either task counters, which are updated as a task progresses, or job counters, which are updated as a job progresses.

Task Counters: Task counters gather information about tasks over the course of their execution, and the results are aggregated over all the tasks in a job. For example, the MAP_INPUT_RECORDS counter counts the input records read by each map task and aggregates them over all map tasks in a job, so that the final figure is the total number of input records for the whole job.

Task counters are maintained by each task attempt and periodically sent to the tasktracker and then to the jobtracker so they can be globally aggregated (for more information, see the "Progress And Status Update" section of the YARN: MapReduce 2 post). To guard against errors due to lost messages, task counters are sent in full each time, rather than as the counts accumulated since the last transmission.
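To see why sending counters in full is more robust than sending increments, here is a minimal plain-Java sketch. It does not use any Hadoop classes; the names CounterReportSketch, FullValueAggregator, DeltaAggregator and simulate are hypothetical and exist only for this illustration. One task attempt reports its running MAP_INPUT_RECORDS count after each interval, and one report is deliberately dropped.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; none of these types are part of Hadoop.
class CounterReportSketch {

    // Keeps the latest full value reported by each task attempt.
    static final class FullValueAggregator {
        private final Map<String, Long> latestByAttempt = new HashMap<>();

        void receive(String attemptId, long fullValue) {
            latestByAttempt.put(attemptId, fullValue); // overwrite, never add
        }

        long total() {
            long sum = 0;
            for (long v : latestByAttempt.values()) {
                sum += v;
            }
            return sum;
        }
    }

    // Adds up increments; a dropped report is lost for good.
    static final class DeltaAggregator {
        private long total = 0;

        void receive(long delta) {
            total += delta;
        }

        long total() {
            return total;
        }
    }

    // Simulates one attempt reporting after each interval.
    // The report with index lostIndex never arrives (use -1 for no loss).
    // Returns { total seen with full values, total seen with deltas }.
    static long[] simulate(long[] recordsPerInterval, int lostIndex) {
        FullValueAggregator full = new FullValueAggregator();
        DeltaAggregator delta = new DeltaAggregator();
        long running = 0;
        for (int i = 0; i < recordsPerInterval.length; i++) {
            running += recordsPerInterval[i];
            if (i == lostIndex) {
                continue; // this message is lost in transit
            }
            full.receive("attempt_0", running);
            delta.receive(recordsPerInterval[i]);
        }
        return new long[] { full.total(), delta.total() };
    }

    public static void main(String[] args) {
        long[] result = simulate(new long[] { 100, 250, 50 }, 1);
        System.out.println("full-value total = " + result[0]); // 400
        System.out.println("delta total      = " + result[1]); // 150
    }
}
```

With full values, the next report that does arrive repairs the gap left by the lost one. With increments, the 250 records from the lost report are never counted.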

Although a counter's value is final only after the job has finished successfully, some counters provide useful information while the job is still running. This information is useful for monitoring a job with the web UI. For example, PHYSICAL_MEMORY_BYTES, VIRTUAL_MEMORY_BYTES and COMMITTED_HEAP_BYTES give an indication of how memory usage varies over the course of a particular task attempt.

Job Counters: Job counters are maintained by the jobtracker (or the application master in YARN), so unlike all other counters (including user-defined ones) they don't need to be sent across the network. They measure job-level statistics, not values that change while a task is running. For example, TOTAL_LAUNCHED_MAPS counts the number of map tasks that were launched over the course of a job, including tasks that failed.

User-Defined Java Counters


MapReduce allows users to define their own set of counters, which are incremented as required in the mapper or reducer. Counters are defined by a Java enum, which serves to group related counters. A job may define an arbitrary number of enums, each with an arbitrary number of fields. The name of the enum is the group name, and the enum's fields are the counter names. Counters are global: the MapReduce framework aggregates them across all maps and reduces to produce a grand total at the end of the job.
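The enum-as-group idea can be sketched in plain Java without any Hadoop classes. Everything here is an assumption made for illustration: the Temperature enum, its MISSING and MALFORMED fields, and the runTask and aggregate helpers are hypothetical stand-ins for what a mapper and the framework would do. Each simulated task keeps its own tallies, and the tallies are then summed into a job-wide total.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration only; none of these types are part of Hadoop.
class EnumCounterSketch {

    // Hypothetical counter group: the enum name acts as the group name,
    // and each constant acts as a counter name.
    enum Temperature { MISSING, MALFORMED }

    // Stand-in for one map task: increments a counter when a condition occurs.
    static Map<Temperature, Long> runTask(List<String> records) {
        Map<Temperature, Long> counters = new EnumMap<>(Temperature.class);
        for (String record : records) {
            if (record.isEmpty()) {
                counters.merge(Temperature.MISSING, 1L, Long::sum);
            } else if (!record.matches("[+-]?\\d+")) {
                counters.merge(Temperature.MALFORMED, 1L, Long::sum);
            }
        }
        return counters;
    }

    // Stand-in for the framework: sums every task's counters into a grand total.
    static Map<Temperature, Long> aggregate(List<Map<Temperature, Long>> perTask) {
        Map<Temperature, Long> total = new EnumMap<>(Temperature.class);
        for (Map<Temperature, Long> task : perTask) {
            task.forEach((counter, value) -> total.merge(counter, value, Long::sum));
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Temperature, Long> task1 = runTask(Arrays.asList("21", "", "x7"));
        Map<Temperature, Long> task2 = runTask(Arrays.asList("", "", "-3"));
        Map<Temperature, Long> totals = aggregate(Arrays.asList(task1, task2));

        String group = Temperature.class.getSimpleName();
        totals.forEach((counter, value) ->
                System.out.println(group + "." + counter + " = " + value));
        // Temperature.MISSING = 3
        // Temperature.MALFORMED = 1
    }
}
```

In a real job the increment would go through the framework's counter API rather than a local map, but the grouping and the final summation follow the same shape.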

Dynamic counters: A job can also use a dynamic counter, one that isn't defined by a Java enum. Because a Java enum's fields are defined at compile time, you can't create new counters on the fly using enums. Suppose we want to count the distribution of temperature quality codes. Although the format specification defines the values that the temperature quality code can take, it is more convenient to use a dynamic counter to emit the values that it actually takes.
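A dynamic counter can be pictured as a counter addressed by two strings, a group name and a counter name, created the first time it is incremented. The following plain-Java sketch mimics that behaviour without any Hadoop classes. DynamicCounterSketch, increment, value, countQualityCodes and the group name "TemperatureQuality" are all hypothetical names chosen for this illustration, and the quality codes are made-up sample data.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these types are part of Hadoop.
class DynamicCounterSketch {

    // group name -> (counter name -> value); entries appear on first increment.
    private final Map<String, Map<String, Long>> groups = new TreeMap<>();

    void increment(String group, String counter, long amount) {
        groups.computeIfAbsent(group, g -> new TreeMap<>())
              .merge(counter, amount, Long::sum);
    }

    long value(String group, String counter) {
        Map<String, Long> counters =
                groups.getOrDefault(group, Collections.<String, Long>emptyMap());
        return counters.getOrDefault(counter, 0L);
    }

    // Counts how often each quality code actually occurs in the input.
    static DynamicCounterSketch countQualityCodes(String[] qualityCodes) {
        DynamicCounterSketch counters = new DynamicCounterSketch();
        for (String code : qualityCodes) {
            // The counter name is only known at run time, so an enum would not work.
            counters.increment("TemperatureQuality", code, 1);
        }
        return counters;
    }

    public static void main(String[] args) {
        String[] sample = { "1", "1", "5", "9", "1" }; // made-up sample codes
        DynamicCounterSketch counters = countQualityCodes(sample);
        System.out.println("code 1: " + counters.value("TemperatureQuality", "1")); // 3
        System.out.println("code 5: " + counters.value("TemperatureQuality", "5")); // 1
        System.out.println("code 4: " + counters.value("TemperatureQuality", "4")); // 0
    }
}
```

Only the codes that occur in the data end up with a counter, which is the convenience the paragraph above describes.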

The method we use on the Reporter object takes a group and counter name using String names:

    public void incrCounter(String group, String counter, long amount)

The two ways of creating and accessing counters, using enums and using strings, are actually equivalent because Hadoop turns enums into strings to send counters over RPC. Enums are slightly easier to work with, provide type safety, and are suitable for most jobs. For the odd occasion
