Project

General

Profile

Bug #6640

Users insertion through CSV is taking ages

Added by Yannick Warnier almost 8 years ago. Updated almost 8 years ago.

Status:
Needs more info
Priority:
Normal
Assignee:
Category:
Administration
Target version:
Start date:
27/08/2013
Due date:
% Done:

0%

Estimated time:
Spent time:
Complexity:
Normal
SCRUM pts - complexity:
?

Description

This is true for 1.9 and 1.10. Importing a CSV list of users is taking a huge time, and I've observed cases where web services importation takes 0.5s on a 16-cores machine with 32GB of RAM.

There's definitely something wrong about users importation. It appears to be linked to the number of checks that are made based on the data available in the CSV. I'm attaching a demo CSV of 50,000 users. It just took me 6h to import 2000 users on an i7-processor laptop with 12GB of RAM, and about the same time to import 13,000 users on a hosted, single core virtual machine by removing the calls to event_system() and the checks on whether to send e-mails or not.

Note that in all cases, I unselect the "send e-mail to new user" box, so there's no real e-mail sending involved.


Files

users.csv (2.96 MB) users.csv Yannick Warnier, 27/08/2013 16:27
users10.csv (601 Bytes) users10.csv 10 users Yannick Warnier, 27/08/2013 21:57
users100.csv (6.58 KB) users100.csv 100 users Yannick Warnier, 27/08/2013 21:57
users1000.csv (66.5 KB) users1000.csv 1000 users Yannick Warnier, 27/08/2013 21:57
users10000.csv (666 KB) users10000.csv 10000 users Yannick Warnier, 27/08/2013 21:57

History

#1

Updated by Yannick Warnier almost 8 years ago

  • Assignee deleted (Julio Montoya)

Sorry, I didn't really assign that to you, but from the category it got assigned. I owe apologies to a few Redmine users now... :-)

#2

Updated by Yannick Warnier almost 8 years ago

Additionally, it seems to create some kind of a loop. I haven't been able to identify it yet, but I just got my disk filled with 25GB of data, which were all freed when I stopped Apache, and this was the only notable process on my computer (!?)

#3

Updated by Yannick Warnier almost 8 years ago

I just realized 6h for 2000 users is like one user every 10 seconds... I can do it faster than a core i7! :-)

#4

Updated by Yannick Warnier almost 8 years ago

I made some additional tests with the different files attached (each of them mentions a number of users it contains).
The problem only appears when using the 10000 users file (not when just 1000 or under).

The errors reported are PHP notices:

[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Notice:  Undefined index: OfficialCode in /var/www/minedu-chamilo/main/admin/user_import.php on line 145, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Stack trace:, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   1. {main}() /var/www/minedu-chamilo/web/index.php:0, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   2. Silex\\Application->run() /var/www/minedu-chamilo/web/index.php:33, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   3. Silex\\Application->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:479, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   4. Symfony\\Component\\HttpKernel\\HttpKernel->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:502, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   5. Symfony\\Component\\HttpKernel\\HttpKernel->handleRaw() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:61, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   6. call_user_func_array() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   7. ChamiloLMS\\Controller\\LegacyController->classicAction() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   8. require_once() /var/www/minedu-chamilo/src/ChamiloLMS/Controller/LegacyController.php:58, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   9. save_data() /var/www/minedu-chamilo/main/admin/user_import.php:343, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Notice:  Undefined index: language in /var/www/minedu-chamilo/main/admin/user_import.php on line 145, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Stack trace:, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   1. {main}() /var/www/minedu-chamilo/web/index.php:0, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   2. Silex\\Application->run() /var/www/minedu-chamilo/web/index.php:33, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   3. Silex\\Application->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:479, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   4. Symfony\\Component\\HttpKernel\\HttpKernel->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:502, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   5. Symfony\\Component\\HttpKernel\\HttpKernel->handleRaw() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:61, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   6. call_user_func_array() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   7. ChamiloLMS\\Controller\\LegacyController->classicAction() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   8. require_once() /var/www/minedu-chamilo/src/ChamiloLMS/Controller/LegacyController.php:58, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   9. save_data() /var/www/minedu-chamilo/main/admin/user_import.php:343, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Notice:  Undefined index: PhoneNumber in /var/www/minedu-chamilo/main/admin/user_import.php on line 145, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP Stack trace:, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   1. {main}() /var/www/minedu-chamilo/web/index.php:0, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   2. Silex\\Application->run() /var/www/minedu-chamilo/web/index.php:33, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   3. Silex\\Application->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:479, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   4. Symfony\\Component\\HttpKernel\\HttpKernel->handle() /var/www/minedu-chamilo/vendor/silex/silex/src/Silex/Application.php:502, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   5. Symfony\\Component\\HttpKernel\\HttpKernel->handleRaw() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:61, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   6. call_user_func_array() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   7. ChamiloLMS\\Controller\\LegacyController->classicAction() /var/www/minedu-chamilo/vendor/symfony/http-kernel/Symfony/Component/HttpKernel/HttpKernel.php:117, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   8. require_once() /var/www/minedu-chamilo/src/ChamiloLMS/Controller/LegacyController.php:58, referer: http://my.minedu.net/web/main/admin/user_import.php
[Tue Aug 27 14:53:18 2013] [error] [client 127.0.0.1] PHP   9. save_data() /var/www/minedu-chamilo/main/admin/user_import.php:343, referer: http://my.minedu.net/web/main/admin/user_import.php

These are the most stupid undefined index errors, but it doesn't explain why these only appear starting at 10,000. Maybe something is wrong in the file, and Chamilo starts expecting these fields although their corresponding columns are not in the CSV...?

#5

Updated by Yannick Warnier almost 8 years ago

The 25GB of space on my disk were actually 59GB, but they were due to tracing being enabled on my system (xdebug).

#6

Updated by Yannick Warnier almost 8 years ago

I'm seeing a 20% speed increase just by commenting the two lines corresponding to the user object logging with event_system:

            //$user_info = api_get_user_info($return);
            event_system(LOG_USER_CREATE, LOG_USER_ID, $return, api_get_utc_datetime(), $user_id_manager);
            //event_system(LOG_USER_CREATE, LOG_USER_OBJECT, $user_info, api_get_utc_datetime(), $user_id_manager);

Also, I've changed a little the insert query (used intval() instead of Database::escape_string() in several places and removed useless concatenations) and have saved 0.0004s per user average. On 50,000 users, this should save us 20s!

#7

Updated by Yannick Warnier almost 8 years ago

The $em->flush() is what takes the most time by itself now. All the rest seems pretty reasonnable. The insert query still takes around 0.033s to execute (about a fifth of the whole function's execution time), but we're already at around 0.05 less by user without having to change any of the inserted data.

I've added a "reliable_data" param (false by default) at the end of the params list of create_user() to avoid filtering (and save time) for all those values if we are sure they are safe.

#8

Updated by Yannick Warnier almost 8 years ago

Results of little tests:
  • insertion of 100 users without filtering (reliable_data) without users_roles at all: 12s (50%)
  • insertion of 100 users without filtering (reliable_data) with users_roles set to $user,$status, manually: 14s (58%)
  • insertion of 100 users without filtering (reliable_data) with users_roles set to $user,$status, manually but still with the check for admin user using entities: 16s (67%)
  • insertion of 100 users without filtering (reliable_data) without users_roles through $em: 24s (100%)

Julio, can we skip the use of the $em object here, at least for now?

#9

Updated by Yannick Warnier almost 8 years ago

Changed code for this bit:

/*
            global $app;
            // Adding user
            /** @var Entity\User $user
            $em = $app['orm.ems']['db_write'];
            $user = $em->getRepository('Entity\User')->find($return);
            $role = $em->getRepository('Entity\Role')->find($status);

            if ($role->getRole() == 'ROLE_ADMIN') {
                UserManager::add_user_as_admin($return);
            }
            //->add() takes about no time (0.0003s)
            $user->getRolesObj()->add($role);
            //->persist() takes about no time either
            $em->persist($user);
            $t2 = microtime() - $t0;
            error_log(__LINE__.': '.$t2);
            // this flush takes about 0.05s (20% of whole insert) all by itself
            $em->flush();
*/
            if ($status == 11) {
                UserManager::add_user_as_admin($return);
            }
            $sql = "insert into users_roles values($return,$status)";
            $res = Database::query($sql);

(I know, a bit ugly)

I'm sending this code for our private project for now, leaving it untouched on Chamilo, but if someone is looking for speed of users importation... you know where to look.

#10

Updated by Yannick Warnier almost 8 years ago

  • Status changed from New to Needs more info
  • Assignee set to Julio Montoya
  • Target version changed from 1.9.8 to 2.0

Also available in: Atom PDF