A/B Testing / ABTEST-109

Research to find alternatives for Apache Hive


Details

    • Story
    • Resolution: Unresolved
    • Minor
    • 8

    Description

      Background:

      We intend to use Apache Hive to run calculations on the test data, but there are most likely better-suited technologies. We should list the pros and cons of possible replacements for the computation layer.

       

      The technologies to be researched should have the following capabilities:

      • UDFs (user-defined functions)
      • Ability to be triggered at least once per day (for example, by integrating with AWS Data Pipeline or a similar scheduler)
        • Triggering once per hour should also be possible if it becomes necessary in the long run
      • Scalability
      • Ideally low cost
      • Support for data stored in S3 or DynamoDB

       

      AC

      • Research alternatives to Apache Hive
        • Apache Spark
        • Amazon Athena
        • Presto
        • Apache Flink
        • Others?
      • Evaluate Cassandra as an alternative to the whole system
      • Prepare a document which includes at least the following:
        • Pros/Cons
        • Price
        • Performance comparison
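
One possible way to keep the research consistent across candidates is to start from a small skeleton that lists every required capability and every section the document must contain, with all entries left unfilled. The sketch below is only an illustration: the names `Evaluation`, `open_items`, `render`, and the `TBD` placeholder are hypothetical, and no verdicts, prices, or performance figures are assumed.

```python
from dataclasses import dataclass, field

# Capabilities taken from the Background section of this ticket.
REQUIRED_CAPABILITIES = [
    "UDF support",
    "Daily trigger (hourly if needed)",
    "Scalability",
    "Low cost",
    "Reads S3 or DynamoDB",
]

# Candidates named in the acceptance criteria ("Others?" can be appended later).
CANDIDATES = ["Apache Spark", "Amazon Athena", "Presto", "Apache Flink"]

TBD = "TBD"  # placeholder: nothing has been evaluated yet


@dataclass
class Evaluation:
    """Research notes for one candidate; every field starts unfilled."""
    name: str
    capabilities: dict = field(
        default_factory=lambda: {c: TBD for c in REQUIRED_CAPABILITIES}
    )
    pros: list = field(default_factory=list)
    cons: list = field(default_factory=list)
    price: str = TBD
    performance: str = TBD

    def open_items(self):
        """Return the names of the entries that still need research."""
        missing = [c for c, v in self.capabilities.items() if v == TBD]
        if not self.pros:
            missing.append("Pros")
        if not self.cons:
            missing.append("Cons")
        if self.price == TBD:
            missing.append("Price")
        if self.performance == TBD:
            missing.append("Performance comparison")
        return missing


def render(evaluations):
    """Render the evaluations as plain-text sections, one per candidate."""
    lines = []
    for ev in evaluations:
        lines.append(ev.name)
        for cap, verdict in ev.capabilities.items():
            lines.append(f"  {cap}: {verdict}")
        lines.append(f"  Pros: {', '.join(ev.pros) or TBD}")
        lines.append(f"  Cons: {', '.join(ev.cons) or TBD}")
        lines.append(f"  Price: {ev.price}")
        lines.append(f"  Performance comparison: {ev.performance}")
    return "\n".join(lines)


evaluations = [Evaluation(name) for name in CANDIDATES]
report = render(evaluations)
```

Calling `open_items()` on a candidate lists what is still missing for it, which may help when checking whether the document is complete.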


            People

              Unassigned Unassigned
              ilgun Ilgun Ilgun
              AuthorX