Feature #3166

Fixing queries in the Search tool

Added by Julio Montoya about 9 years ago. Updated almost 8 years ago.

Status: Bug resolved
Priority: Normal
Category: -
Target version:
Start date: 29/03/2011
Due date:
% Done: 100%
Estimated time:
Spent time:
Complexity: Normal
SCRUM pts - complexity: ?

Description

I found a lot of warning messages when I started debugging the search tool. I started with the document upload.

Associated revisions

Revision bd1ab08d (diff)
Added by Julio Montoya about 9 years ago

Showing search settings status when trying to use the search module see #3166

Revision 4807477e (diff)
Added by Yannick Warnier almost 8 years ago

Fixing fulltext indexer tool - refs #3166

History

#1

Updated by Julio Montoya about 9 years ago

OK, I found many problems in this installation: the tool is not very well documented, and I just realized that we need some extra tools in order to parse HTML, TXT and Word documents!

That's why it didn't work in one of our tests (with Yannick).

As you can see here, there are a lot of exec() calls ...

I will update the configuration settings in order to tell the admin that he needs those packages ...


function get_text_content($doc_path, $doc_mime) {
        // TODO: review w$ compatibility

        // Use usual exec output lines array to store stdout instead of a temp file
        // because we need to store it at RAM anyway before index on DokeosIndexer object
        $ret_val = null;
        switch ($doc_mime) {
            case 'text/plain':
                $handle = fopen($doc_path, 'r');
                $output = array(fread($handle, filesize($doc_path)));
                fclose($handle);
                break;
            case 'application/pdf':
                exec("pdftotext $doc_path -", $output, $ret_val);
                break;
            case 'application/postscript':
                $temp_file = tempnam(sys_get_temp_dir(), 'chamilo');
                exec("ps2pdf $doc_path $temp_file", $output, $ret_val);
                if ($ret_val !== 0) { // shell fail, probably 127 (command not found)
                    return false;
                }
                exec("pdftotext $temp_file -", $output, $ret_val);
                unlink($temp_file);
                break;
            case 'application/msword':
                exec("catdoc $doc_path", $output, $ret_val);
                //var_dump($output);
                break;
            case 'text/html':
                exec("html2text $doc_path", $output, $ret_val);
                break;
            case 'text/rtf':
                // Note: correct handling of code pages in unrtf
                // on debian lenny unrtf v0.19.2 can not, but unrtf v0.20.5 can
                exec("unrtf --text $doc_path", $output, $ret_val);
                if ($ret_val == 127) { // command not found
                    return false;
                }
                // Avoid index unrtf comments
                if (is_array($output) && count($output) > 1) {
                    $parsed_output = array();
                    foreach ($output as & $line) {
                        if (!preg_match('/^###/', $line, $matches)) {
                            if (!empty($line)) {
                                $parsed_output[] = $line;
                            }
                        }
                    }
                    $output = $parsed_output;
                }
                break;
            case 'application/vnd.ms-powerpoint':
                exec("catppt $doc_path", $output, $ret_val);
                break;
            case 'application/vnd.ms-excel':
                exec("xls2csv -c\" \" $doc_path", $output, $ret_val);
                break;
        }
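
The excerpt above shells out to a different command-line tool for each MIME type, so a missing package only shows up when indexing fails. The check the admin needs could be sketched roughly as follows. This is a minimal sketch, not the code that was committed for this issue: the function names (required_converters, command_exists, missing_converters, pdftotext_command) are hypothetical, the MIME-to-command map is simply read off the switch above, and the PATH lookup assumes a POSIX shell. The last helper also shows how escapeshellarg() keeps a path containing spaces from breaking the command line.

```php
<?php
// Hypothetical helpers for illustration only; none of these names exist in Chamilo.

/**
 * External commands used per MIME type, read off the switch in get_text_content().
 * 'text/plain' is read with fread() and needs no external tool.
 */
function required_converters(): array
{
    return [
        'application/pdf'               => ['pdftotext'],
        'application/postscript'        => ['ps2pdf', 'pdftotext'],
        'application/msword'            => ['catdoc'],
        'text/html'                     => ['html2text'],
        'text/rtf'                      => ['unrtf'],
        'application/vnd.ms-powerpoint' => ['catppt'],
        'application/vnd.ms-excel'      => ['xls2csv'],
    ];
}

/**
 * True if $command resolves on the PATH.
 * Assumption: exec() runs a POSIX shell that provides `command -v`.
 */
function command_exists(string $command): bool
{
    $output = [];
    $status = 1;
    exec('command -v ' . escapeshellarg($command) . ' 2>/dev/null', $output, $status);
    return $status === 0;
}

/**
 * Returns [mime type => list of missing commands].
 * $exists can be swapped for a stub so the check can be tested without the tools installed.
 */
function missing_converters(?callable $exists = null): array
{
    $exists = $exists ?? 'command_exists';
    $missing = [];
    foreach (required_converters() as $mime => $commands) {
        foreach ($commands as $command) {
            if (!$exists($command)) {
                $missing[$mime][] = $command;
            }
        }
    }
    return $missing;
}

/**
 * Builds the pdftotext call with the path escaped, so spaces or quotes
 * in $doc_path are not interpreted by the shell.
 */
function pdftotext_command(string $doc_path): string
{
    return 'pdftotext ' . escapeshellarg($doc_path) . ' -';
}
```

A settings page could call missing_converters() and list the result next to each document type, so the admin sees which packages to install before trying to index anything.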

#3

Updated by Julio Montoya about 9 years ago

  • % Done changed from 0 to 50

#4

Updated by Yannick Warnier about 9 years ago

  • Target version set to 1.8.8 stable

Seems really great right now as it is. For some reason, the indexing works but not the search on http://chamilodev.beeznest.com, but I'm pretty sure that was an install bug.

As far as I know, this is pretty much complete.

#5

Updated by Julio Montoya about 9 years ago

  • Status changed from New to Needs more info
  • Target version changed from 1.8.8 stable to 1.9 Stable

Moving to 1.8.9. This is not so important; it is working, at least in my local installation ...

#6

Updated by Yannick Warnier about 8 years ago

  • Target version changed from 1.9 Stable to 1.9 Beta

#7

Updated by Yannick Warnier almost 8 years ago

  • Target version changed from 1.9 Beta to 1.9 RC1

#8

Updated by Yannick Warnier almost 8 years ago

  • Status changed from Needs more info to Bug resolved
  • Assignee set to Yannick Warnier
  • % Done changed from 50 to 100

Fixed the search tool, changed names of libraries, changed course ID to build URL.
