htmlArea

A directory of browser-based WYSIWYG editors


HTMLArea 3.0-Beta PHP SpellChecker 1.0


The htmlArea 2 & 3 editors have been discontinued.

We've made these forums available as a read-only reference and knowledge-base for people using or developing editors based on htmlArea 2 or 3.

Anyone who is interested in taking over version 2 or 3 is free to do so. All we ask is that you choose a new name that doesn't have "htmlarea" in it to avoid confusion with this site. We'll even give you a link in the directory to make it easier for people to find you. If you are developing or hosting an htmlArea-based editor under a new name, please submit it to our directory.



mlall
Novice

Aug 12, 2003, 7:10 PM

Post #1 of 31 (7377 views)
HTMLArea 3.0-Beta PHP SpellChecker 1.0

I just started to work on a PHP version of the CGI for the spellchecker.

Here is what I have done so far:

- Changed the form action to point to the PHP script instead of the CGI.
- Used the command-line version of Aspell instead of the PHP pspell extension, so there is no need to install a PHP extension of any kind. The pspell extension doesn't work on Windows (cf. the PHP documentation), but the command-line version may work (I can't test it).
- The list of available dictionaries is static.
- The following line of spell-check-logic.php:
"$aspellcommand = 'cat '.$temptext.' | /usr/bin/aspell ..." needs to be changed to match your server settings.

Known bugs:

- [Fixed: 2003-08-13 (this also fixed the JavaScript bug that occurred when no suggestion was provided)] Any word without a suggestion is completely ignored.
- [Fixed: 2003-08-15] Text inside a link is ignored too (I need to check whether it can be fixed; it depends on whether the aspell command line has an option for it).
- It is not UTF-8 "aware". (Update 2003-08-15: after looking into it, I'm not sure it will be before aspell itself is UTF-8 "aware".)


PS:
Mishoo: in your Perl version of the spell checker, when there is no suggestion for a spelling error, the JavaScript returns an error. In this version I add the "misspelled" word itself as a suggestion, so the JavaScript error is gone.
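
The fallback described in the PS can be sketched as follows. This is only an illustration, not the code from the attachment: it assumes Aspell is run in its Ispell-compatible pipe mode (`aspell -a`), where a line starting with `&` reports a misspelled word with suggestions and a line starting with `#` reports a misspelled word with none. The function name `parseAspellOutput` is hypothetical.

```javascript
// Parse the text printed by `aspell -a` into a list of spelling errors.
// Assumed line formats (Ispell-compatible pipe mode):
//   & word count offset: sugg1, sugg2, ...   misspelled, with suggestions
//   # word offset                            misspelled, no suggestions
function parseAspellOutput(output) {
  const errors = [];
  for (const line of output.split(/\r?\n/)) {
    if (line.startsWith("& ")) {
      const colon = line.indexOf(":");
      const word = line.slice(2, colon).split(" ")[0];
      const suggestions = line
        .slice(colon + 1)
        .split(",")
        .map((s) => s.trim())
        .filter(Boolean);
      errors.push({ word, suggestions });
    } else if (line.startsWith("# ")) {
      const word = line.slice(2).split(" ")[0];
      // No suggestion available: offer the word itself so the
      // client-side suggestion list is never empty.
      errors.push({ word, suggestions: [word] });
    }
    // Every other line (version banner, "*" for correct words, blanks) is skipped.
  }
  return errors;
}
```

With this shape, a word that Aspell cannot suggest anything for still reaches the browser as an error with one entry in its suggestion list.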


(This post was edited by mlall on Aug 18, 2003, 12:15 PM)
Attachments: SpellChecker_php_aspell_cli.zip (3.12 KB)


mishoo
User

Aug 13, 2003, 5:23 AM

Post #2 of 31 (7335 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 0.5

This project is welcome ;-) Here's how the Perl script works.

1. It parses the given HTML and calls a function for each chunk of text. It therefore ignores tag names and attributes.
2. It calls the spell checker for each word in a chunk of text, and if the word is misspelled it replaces that word with <span class="HA-spellcheck-error">origword</span><span class="HA-spellcheck-suggestions">suggestion1,suggestion2,etc.</span>
3. It gives the "mangled" HTML back to the page.
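
Step 2 above can be sketched roughly like this. It is a minimal illustration, not the Perl script itself: `markMisspellings` and the `lookup` callback are hypothetical names, and `lookup(word)` is assumed to return an array of suggestions for a misspelled word or `null` for a correct one.

```javascript
// Escape characters that would break the generated markup.
function escapeHtml(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap every misspelled word of a text chunk in the two spans the
// spell checker UI expects; correct words are left untouched.
function markMisspellings(text, lookup) {
  return text.replace(/[\p{L}\p{M}']+/gu, (word) => {
    const suggestions = lookup(word);
    if (!suggestions) return word;
    return (
      '<span class="HA-spellcheck-error">' + word + "</span>" +
      '<span class="HA-spellcheck-suggestions">' +
      escapeHtml(suggestions.join(",")) +
      "</span>"
    );
  });
}
```

The function is meant to be called on text chunks only, never on tag names or attributes, which is why the parsing in step 1 matters.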

Notes on the HTML parser:

1. It ignores text such as <span class="HA-spellcheck-fixed">text</span>. That's because this text was already fixed in a previous check with another dictionary.
2. Before giving text to the spell checker, it does the following:
-- it replaces any known entities (Perl really helps here: there is a decode_entities function in HTML::Entities). It also replaces entities like &#DECIMAL; or &#xHEXA; with the appropriate Unicode character.
-- it encodes the Unicode string into an encoding accepted by Aspell for the current dictionary.
3. After the spell checker has done the dirty work, it encodes the text from the Aspell encoding back to Unicode.
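
The entity replacement in note 2 could look something like the sketch below. It is an assumption-laden stand-in for Perl's decode_entities: the `NAMED` table is deliberately tiny and hypothetical (a real port would need the full HTML entity list), and `decodeEntities` is an illustrative name.

```javascript
// Deliberately small table of named entities, for illustration only.
const NAMED = {
  amp: "&",
  lt: "<",
  gt: ">",
  quot: '"',
  nbsp: "\u00a0",
  eacute: "\u00e9",
  egrave: "\u00e8",
};

// Replace &name;, &#DECIMAL; and &#xHEXA; with the matching character.
// A single pass is used so that "&amp;lt;" becomes "&lt;" and not "<".
function decodeEntities(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match, body) => {
    if (body[0] === "#") {
      const isHex = body[1] === "x" || body[1] === "X";
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}
```

Decoding before the spell check means a word such as `&eacute;l&#232;ve` reaches the checker as one word rather than as fragments separated by entity syntax.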

One more note about calling aspell in command-line mode. I could have done so too, but using the Text::Aspell module increased performance. That module is probably using the command line too (or the libaspell.so library), but it instantiates the spell checker only once and reads the dictionary only once. So theoretically, calling aspell in command-line mode for each word could take much longer.

Suggestion. It's probably possible to open a bidirectional pipe from PHP, so do so only once when the program starts, then you feed the output pipe with words for Aspell, and you read the input pipe with Aspell's suggestions ;-)
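
A simplified take on this suggestion is sketched below, using Node.js rather than PHP purely for illustration. Instead of keeping a long-lived bidirectional pipe, it starts Aspell once per request, writes the whole text to its standard input and reads everything back, which keeps the "one process, one dictionary load" benefit. It assumes the `aspell` executable is on the PATH; `buildAspellInput` and `runAspellOnce` are hypothetical names.

```javascript
const { spawnSync } = require("child_process");

// In Aspell's pipe mode a leading "^" means "spell-check the rest of
// this line", so user text is never mistaken for a pipe command.
function buildAspellInput(text) {
  return text.split(/\r?\n/).map((line) => "^" + line).join("\n") + "\n";
}

// Start aspell once, feed it the complete text, return its raw output.
function runAspellOnce(text, lang) {
  const result = spawnSync("aspell", ["-a", "--lang=" + lang], {
    input: buildAspellInput(text),
    encoding: "utf8",
  });
  if (result.error) throw result.error; // for example, aspell is not installed
  return result.stdout;
}

// Usage (requires aspell and an English dictionary on the server):
//   const raw = runAspellOnce("Some text with a speling error", "en");
```

Whether a truly persistent pipe is worth the extra complexity depends on how the script is hosted; for a script that runs once per form submission, one process per request is usually the natural unit.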

Anyway, I hope all this info helps. If your module works it'll probably get into the main distro ;-) you have been warned. There are many people who would prefer PHP code instead of Perl. I personally prefer Perl, from a "programming-language-point-of-view". ;-)
--
Mihai Bazon,
dynarch.com
Applied Web Standards


mlall
Novice

Aug 13, 2003, 10:23 AM

Post #3 of 31 (7318 views)
Re: [mishoo] HTMLArea 3.0-Beta PHP SpellChecker 0.5

Thanks for the information.

I chose to call Aspell only once for the complete text, so I have only one shell call to make.
If you send a file to aspell on the command line, it spell-checks the complete document, so you just need to analyze the output of the aspell command and mark up the content with the errors and suggestions it gave you.

I did this because it's impossible for me to add the Text::Aspell Perl module, but the aspell command line is available on the server, and the pspell module is not included in my PHP build yet.

The attached file works, with the small limitations I listed. I hope to finish it today or tomorrow.

Thanks for this wonderful program.


mlall
Novice

Aug 13, 2003, 1:10 PM

Post #4 of 31 (7313 views)
Re: [mishoo] HTMLArea 3.0-Beta PHP SpellChecker 0.5

In your Perl spellchecker on the demo, if you enter the word "évaluation" (French for "evaluation"), the spell checker cuts the word after the "é" instead of keeping it as part of the word (it checks only "valuation"). The problem persists even if I ask it to re-check in French.
The same kind of problem occurs with other accented words such as "élève", where it cuts the word and checks only "ve".
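
The root cause is not identified in this thread, so the sketch below only shows one plausible way such a symptom can arise: splitting text into words with an ASCII-only pattern. The function names are hypothetical, and the Unicode-aware variant is an illustration of the idea rather than a fix for the Perl script.

```javascript
// ASCII-only word matching: accented letters act as word separators.
function asciiWords(text) {
  return text.match(/[A-Za-z]+/g) || [];
}

// Unicode-aware word matching: any letter (plus combining marks) counts.
function unicodeWords(text) {
  return text.match(/[\p{L}\p{M}]+/gu) || [];
}

// asciiWords("élève")       -> ["l", "ve"]
// unicodeWords("une élève") -> ["une", "élève"]
```

If the word boundaries are computed on bytes of a multi-byte encoding rather than on characters, a similar split can happen even with a Unicode-aware pattern, which is one reason the encoding steps described earlier in the thread matter.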


mishoo
User

Aug 13, 2003, 6:48 PM

Post #5 of 31 (7301 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 0.5

Yes, it seems mine has this kind of problem. I'm not sure yet whether it's a problem with Aspell or with my Perl code, but it happens. I don't have a workaround yet.


mlall
Novice

Aug 13, 2003, 7:22 PM

Post #6 of 31 (7296 views)
HTMLArea 3.0-Beta PHP SpellChecker 1.0 (needs PHP pspell module)

Updated (2003-08-18):

New version to download.

Here is another version.

Requirements:
- PHP pspell module
- PEAR::XML_HTMLSax http://pear.php.net/package-info.php?package=XML_HTMLSax (tested only with v2.0.1)

- This version is not yet UTF-8 friendly but works really well on English.
- This version also uses a static list of dictionaries (the PHP module has no direct function to get the list of installed dictionaries).
- This version checks the text inside a link.

The PHP pspell module needs to be compiled into PHP, or compiled separately and activated in php.ini.


(This post was edited by mlall on Aug 18, 2003, 12:14 PM)
Attachments: SpellCheckerPHP_pspell_module.zip (3.53 KB)


bill@wemc.net
Novice

Aug 19, 2003, 8:09 AM

Post #7 of 31 (7205 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Got a very small and minor bug but don't know if anyone else noticed. If you run the PHP spell checker on a null entry (that is, nothing in the textarea) it returns a PHP error. Not a big deal, and if I knew more about PHP I would fix it, but alas, this is my first ever PHP script.


mlall
Novice

Aug 19, 2003, 11:00 AM

Post #8 of 31 (7198 views)
Re: [bill@wemc.net] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Hi,

I added a test in the new attached version to start the spell checking only if the content is longer than 1 char. Let me know if it fixes your problem.

This is the PHP pspell module version, not the command-line version. If you need it for the command-line version, let me know.


In my version I can't have empty content, so I couldn't reproduce your problem. I always have at least a <br />.
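
A guard of this kind might look like the sketch below. It is a variant of the test described above, not the attached code: besides the length check, it also strips tags first, so that content consisting only of something like `<br />` counts as empty. `hasCheckableText` is a hypothetical name.

```javascript
// Decide whether there is anything worth sending to the spell checker.
function hasCheckableText(html) {
  const text = html
    .replace(/<[^>]*>/g, " ") // drop tags such as <br />
    .replace(/&nbsp;/g, " ")  // treat non-breaking spaces as blanks
    .trim();
  return text.length > 1;
}

// hasCheckableText("")             -> false
// hasCheckableText("<br />")       -> false
// hasCheckableText("<p>hello</p>") -> true
```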


(This post was edited by mlall on Aug 19, 2003, 11:25 AM)
Attachments: spell-check-logic.php (6.14 KB)


bill@wemc.net
Novice

Aug 19, 2003, 1:18 PM

Post #9 of 31 (7194 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

I will need it for the command-line version, but I would like to check out the pspell module version. What's involved in getting that online?


mlall
Novice

Aug 19, 2003, 2:02 PM

Post #10 of 31 (7191 views)
Re: [bill@wemc.net] HTMLArea 3.0-Beta PHP SpellChecker 1.0

The PHP pspell module version requires that the pspell module is active.

You can enable it when compiling PHP by adding --with-pspell=/path/to/aspell/lib to the configure script options, replacing /path/to/aspell/lib with the path to the aspell lib on your server.

If you use a precompiled version, your distribution may have a deb/rpm/tgz package that adds the pspell module to PHP. Once the module is installed, check the php.ini file to make sure the module is loaded at runtime. On Windows the pspell module doesn't exist, so there is no choice: you need to use the command-line version.

You also need the PEAR::XML_HTMLSax package, which is just a collection of PHP scripts for parsing HTML content. (It's a PEAR package, so it requires PEAR itself.)

And I think that's it.


I will upload the changed command-line version today.


mlall
Novice

Aug 19, 2003, 2:13 PM

Post #11 of 31 (7190 views)
Re: [bill@wemc.net] HTMLArea 3.0-Beta PHP SpellChecker 1.0


In Reply To
Got a very small and minor bug but don't know if anyone else noticed. If you run the PHP spell checker on a null entry (that is, nothing in the textarea) it returns a PHP error. Not a big deal, and if I knew more about PHP I would fix it, but alas, this is my first ever PHP script.


Can you give me the PHP error?


bill@wemc.net
Novice

Aug 19, 2003, 4:11 PM

Post #12 of 31 (7182 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Thanks. I'm Win32 so I guess it's the cmd line version for me.


bill@wemc.net
Novice

Aug 19, 2003, 4:18 PM

Post #13 of 31 (7180 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0


In Reply To
Can you give me the PHP error?


It's the darnedest thing. I went to get it for you and now I can't reproduce it either.


mlall
Novice

Aug 19, 2003, 5:36 PM

Post #14 of 31 (7172 views)
Re: [bill@wemc.net] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Here is a version with a slightly different test of the content size to decide whether to start the spell checker.
Attachments: spell-check-logic.php (3.87 KB)


bill@wemc.net
Novice

Aug 19, 2003, 10:08 PM

Post #15 of 31 (7167 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Thanks. I'm learning a lot by looking at this code.

I have another request: I use htmlarea on a few different content management pages. On some of those pages I pass in something like this:

cfg.pageStyle =
"body { background-color: BLACK; color: WHITE; }";

to change the color of the htmlarea background. Is there a way to pass that also to the spellchecker?

In this example I have a black background with white text. When I spell-check it, I get white text on a white background. If there is no way to pass the HTML color, can we force the spell checker text to be black on white without changing the htmlarea style?


mlall
Novice

Aug 20, 2003, 4:35 PM

Post #16 of 31 (7145 views)
Re: [bill@wemc.net] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Here is a solution, maybe not the best one, but it works:

- In the page that contains htmlarea, add:

Code
var mystyle = 'body { background-color: BLACK; color: WHITE; }';
function getstyle()
{
    return mystyle;
}

mystyle must contain the same thing as your cfg.pageStyle (maybe it's possible to read the content of cfg.pageStyle directly; I didn't try).

- In spell-check-ui.html, you must add three things:
- 1st:

Code
<input type="hidden" name="style" id="style" />

inside the form of this page. This is the form you changed to use the PHP script instead of the CGI.

- 2nd: add a name and an ID to the form:

Code
name="form1" id="form1"

Both name and id must be the same.

- 3rd: just after the form:

Code
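<script type="text/javascript">
// Ask the opener (the page hosting htmlarea) for its style
var mystyle = opener.getstyle();
// Copy it into the hidden "style" field of form1
document.forms["form1"].elements["style"].value = mystyle;
</script>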

This code calls the opener window's getstyle function, which returns the value of your mystyle variable, and then inserts that value into the hidden form field we just added.

- in spell-check-logic.php
Just before the </head> add:

Code
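<style type="text/css">
<?php
// Print the posted style, if any; dropping "<" keeps the value from closing the style element
echo isset($_POST['style']) ? str_replace('<', '', $_POST['style']) : '';
?>
</style>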

This prints your style into the spell checker's iframe, so your style is applied to the spell checker window as well.

It's not perfect; it's just a really quick and dirty hack.
Hope this helps.
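
On the open question of reading cfg.pageStyle directly instead of keeping a separate mystyle copy: a minimal, untested sketch is shown here. It assumes that `cfg` (the editor configuration object used when setting `cfg.pageStyle`) is reachable as a global on the page hosting htmlarea, which may not be the case if it is declared inside an init function; the object literal below only stands in for it.

```javascript
// Stand-in for the editor configuration object; on a real page this
// would be the same `cfg` on which pageStyle was set (assumed global).
var cfg = { pageStyle: "body { background-color: BLACK; color: WHITE; }" };

function getstyle() {
  // Return the configured page style, or an empty string if none was set.
  return (cfg && cfg.pageStyle) || "";
}
```

This keeps a single source of truth for the style, so the spell checker window cannot drift out of sync with the editor.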


Denver Dave
User

Aug 20, 2003, 7:01 PM

Post #17 of 31 (7144 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

"The php pspell module version require that the pspell module is active:
you can activate it during the php compilation by adding --with-pspell=/path/to/aspell/lib on the configure script option and changing the /path/to/aspell/lib by the path on your server to the aspell lib. "
*****************
So does this mean that if we are on a shared server and do not have access to recompile PHP, the PHP version is not the way to go? Can the PHP scripts be placed in the user directories instead?


mlall
Novice

Aug 20, 2003, 7:11 PM

Post #18 of 31 (7142 views)
Re: [Denver Dave] HTMLArea 3.0-Beta PHP SpellChecker 1.0

If your hosting company doesn't provide the pspell module (you can find out on a phpinfo page), you will have a better chance with the command-line version in this same thread.

The command-line version only needs the aspell command-line program to be installed, plus a PEAR module (which is just some PHP scripts, so you can install it yourself). The aspell command-line program could already be installed on the server even if the pspell module is not. And you may be able to install aspell yourself, since it's "just" an executable with some data for the dictionaries. But that depends on the hosting company.


redbeard
New User

Aug 21, 2003, 5:26 PM

Post #19 of 31 (7118 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Has anyone considered adding the ability to add words to a personal dictionary? The application I'm working on has lots of unique vocabulary which needs to be expanded in the live system. So an Add Word button would be great.

Of course, you would have to be able to specify the dictionary (and language, etc.), and add the option to turn it on or off. For instance, in my application only certain people will be allowed to add words.

I haven't reached the point where I'm ready to start working on this, yet, but when I do (next week, maybe?) I can possibly add this functionality.
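
One possible starting point, offered only as a sketch: in Aspell's pipe mode (`aspell -a`), a line of the form `*word` adds a word to the personal dictionary and a line containing `#` saves it. The helper below builds such commands; `buildAddWordCommands` and the `userMayAddWords` flag are hypothetical, and which personal dictionary is written to would depend on how aspell is started (for example its `--lang` and `--personal` options).

```javascript
// Build the pipe-mode commands that add words to the personal dictionary.
// Returns an empty string when the user is not allowed to add words.
function buildAddWordCommands(words, userMayAddWords) {
  if (!userMayAddWords) return "";
  // Accept plain words only, so a crafted "word" cannot smuggle in
  // extra lines or other pipe commands.
  const clean = words.filter((w) => /^[\p{L}\p{M}']+$/u.test(w));
  if (clean.length === 0) return "";
  return clean.map((w) => "*" + w + "\n").join("") + "#\n";
}

// buildAddWordCommands(["htmlArea"], true)  -> "*htmlArea\n#\n"
// buildAddWordCommands(["htmlArea"], false) -> ""
```

The permission flag reflects the requirement that only certain people may add words; how that permission is determined is left to the application.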

Michael


mlall
Novice

Aug 25, 2003, 12:09 PM

Post #20 of 31 (7062 views)
Re: [redbeard] HTMLArea 3.0-Beta PHP SpellChecker 1.0

I haven't tried to add the "add to a personal dictionary" feature because I didn't need it. But it could be a really great addition to the spellchecker.


Tocobob
New User

Sep 3, 2003, 6:23 PM

Post #21 of 31 (6932 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Well, here's my hack to replace the CGI with PHP. I wish I had read the forum before hacking it up. Anyway, I'll put it here just in case someone finds it useful.

It uses
  • PEAR: XML/Parser
  • PHP: Mod pspell (aspell)


It has hardcoded languages and was tested with Swedish and English, so it has rudimentary support for UTF-8.

I haven't tried any of the other PHP versions... yet.
Attachments: spell-check-logic.php (2.15 KB)


jammjamm
Novice

Sep 17, 2003, 12:18 AM

Post #22 of 31 (6692 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Thanks, mlall, for this nice PHP version of the spell checker...!

I'm having a problem, where if I spell-check a file that has some pretty complex HTML in it, the HTML tags get interrupted by the spell checker. For example, I have an image tag:

<img height="31" alt="air.html" src="/images/file.png" width="24" border="0">

but the speller inserts its <span>s in the middle of the tag:

<img height="31" alt="air.<span class="HA-spellcheck-error>html</span><span class="HA-spellcheck-suggestions">HTML,ht ml</span>" src="/images/file.png" width="24" border="0">

It seems to interrupt alt attributes after a "." or an "_" (maybe in other situations too).

Any pointers about where in the spell-check-logic.php files I should look (I assume the logic is in there)? I'd like it to skip any checking inside "<" and ">" I think...

Thanks a million!
-Jamie


jammjamm
Novice

Sep 17, 2003, 11:28 AM

Post #23 of 31 (6671 views)
Re: [jammjamm] HTMLArea 3.0-Beta PHP SpellChecker 1.0

I think I figured it out: add " --rem-sgml-check=alt" to the $aspellcommand to make it skip checking alt attributes...
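
A more general way to keep the checker out of tags altogether is to split the HTML into tags and text and only pass the text parts on. The sketch below illustrates that idea with a regular expression; it is not what spell-check-logic.php does, `checkTextOnly` and `checkChunk` are hypothetical names, and the pattern would be fooled by a ">" inside an attribute value, so a real HTML parser (such as the XML_HTMLSax package mentioned in this thread) is the more robust choice.

```javascript
// Apply checkChunk to text between tags only; tags pass through untouched.
function checkTextOnly(html, checkChunk) {
  // split() with a capturing group keeps the tags in the result:
  // even indexes are text, odd indexes are tags.
  return html
    .split(/(<[^>]*>)/)
    .map((part, i) => (i % 2 === 1 ? part : checkChunk(part)))
    .join("");
}

// Example with a stand-in checker that upper-cases text:
//   checkTextOnly('<img alt="air.html"> some text', (s) => s.toUpperCase())
//   -> '<img alt="air.html"> SOME TEXT'
```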

Thanks... :)


mlall
Novice

Sep 24, 2003, 10:16 AM

Post #24 of 31 (6566 views)
Re: [jammjamm] HTMLArea 3.0-Beta PHP SpellChecker 1.0

Sorry, I didn't have time to come back earlier.

I'm glad you found the answer.

Thanks for the fix.


squiz
Novice

Oct 9, 2003, 10:44 PM

Post #25 of 31 (6395 views)
Re: [mlall] HTMLArea 3.0-Beta PHP SpellChecker 1.0

I've done up a new version of the spell checker for the PHP Plugin Based Mod with another guy from my work. It's based on at least three versions floating around; it requires PHP to be compiled with pspell and needs the XML_HTMLSax PEAR module.

It fixes a few problems that I was having with the various versions I tried. The XML/Parser based solution was causing problems with malformed HTML (tables etc) and the XML_HTMLSax solution had a few problems with complex HTML.

There are probably some bugs floating around and there are still some things I want to put in to make it easier to use. Note that the dictionary selection list is gone: I don't like the idea of hard-coding that sort of thing, because this will be distributed in our CMS. The dictionary will most likely become a config option for the plugin once I work out how to do that.

Check out a demo of the spell checker at http://dev.squiz.net/~gsherwood/htmlarea or take a look at the PHP plugin based forum here (where you can download the code).

Greg
