l i n u x - u s e r s - g r o u p - o f - d a v i s
L U G O D
 
Next Meeting:
October 20: Web Application Hacking: How to Make and Break Security on the Web
Next Installfest:
TBD
Latest News:
Oct. 10: LUGOD Installfests coming again soon
Page last updated:
2009 Dec 01 22:12

The following is an archive of a post made to our 'vox-tech mailing list' by one of its subscribers.

Report this post as spam:

(Enter your email address)
Re: [vox-tech] MySQL question: making stats out of orders
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [vox-tech] MySQL question: making stats out of orders



On Tue, Dec 01, 2009 at 05:32:20PM -0800, Bill Kendrick wrote:
> 
> Ok, so I've got a few queries our website does that
> are used to determine which products are 'top sellers'
> in the past 90 days' worth of sales.
> 
> Right now, this involves joining the product table
> with the orders table, by way of an items-in-the-order table.
> 
> In other words, to figure out what sold the best in the
> last 90 days, it's something like:
> 
>   SELECT * FROM products
>   JOIN orderitem ON products.id = orderitem.productid
>   JOIN orders ON orders.id = orderitem.orderid
>   WHERE orders.timestamp >= { 90 days ago }
> 
> As time goes on, our records or orders, order items, and
> actual products all keep growing.  That means this task
> gets more and more intensive.

Is this task getting more intensive? If is (and the number of
transactions per day hasn't grown significantly), then you probably
don't have indices set up correctly.

I think that having an an index on orders.timestamp,
and an index on orderitem.orderid (in addition to the primary keys
products.id and orders.id) should optimize this query, but it always
pays to determine what's going on with EXPLAIN SELECT.

If that still doesn't speed things up enough, maybe consider using a
table with the MERGE storage engine to move old entries from the
orders table and the orderitem table into archives. Then only query
the table full of recent order data for queries, and only query the
other tables in the merge when you need to.

> Now, we've got some statistics tables set up for other
> purposes -- how many times a product was viewed,
> as well as when any particular logged-in-users last
> visited particular items, which allows us to do the
> "people who recently looked at X also recently looked at Y".
> 
> I'm thinking we could do something similar for sales,
> to simplify the "top sellers" queries.  They'd hit
> a single stats table (which could, on a daily basis,
> be cleared of any >90-days-ago entries), rather than
> having to JOIN two large tables that go back to day 1.
> 
> 
> My current problem -- no doubt due to lack of brain power
> at the end of the day -- is how to take the CURRENT data
> from the orders and orderitem tables so that I can do an
> initial population of a new stats table.
> 
> Our "views" stats table, for example, is something
> along the lines of:
> 
>   productid, date, views
> 
> where those first two columns act as a key.  (Date is at the
> 'day' level of granularity.)
> 
> 
> And we count a view like so:
> 
>   INSERT INTO views (productid, date, views) VALUES (###, NOW(), 1)
>   ON DUPLICATE KEY UPDATE views=views+1
> 
> In other words, the first time a product is viewed on any given day,
> we insert a new row, otherwise we update its existing row.
> 
> 
> Anyone feel like flexing their MySQL muscles and provide some
> ideas on how to populate a similar table (tracking sales) based
> on the orders/orderitem table combo explained above?

I'm not really clear on what you're looking for here. We need a better
idea of what stats you're running to know how to structure the table.

It would seem pretty simple to create the analagous table for orders:

    orderstats consisting of productid, date, orders

and every time you complete an order, you run the query:

    insert into orderstats(productid, date, orders) VALUES (###, NOW(),
    QTY) on duplicate key update orders=orders+QTY

-- 
Chanoch (Ken) Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/
_______________________________________________
vox-tech mailing list
vox-tech@lists.lugod.org
http://lists.lugod.org/mailman/listinfo/vox-tech



LinkedIn
LUGOD Group on LinkedIn
Sign up for LUGOD event announcements
Your email address:
facebook
LUGOD Group on Facebook
'Like' LUGOD on Facebook:

Hosting provided by:
Sunset Systems
Sunset Systems offers preconfigured Linux systems, remote system administration and custom software development.

LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.

Sponsored in part by:
EDGE Tech Corp.
For donating some give-aways for our meetings.